-rw-r--r--  Documentation/RCU/RTFP.txt          858
-rw-r--r--  Documentation/RCU/rcubarrier.txt     12
-rw-r--r--  Documentation/memory-barriers.txt    10
-rw-r--r--  Documentation/timers/NO_HZ.txt       44
-rw-r--r--  include/linux/debugobjects.h          6
-rw-r--r--  include/linux/jiffies.h               8
-rw-r--r--  include/linux/rculist.h               5
-rw-r--r--  include/linux/rcupdate.h             22
-rw-r--r--  init/Kconfig                          1
-rw-r--r--  kernel/rcu.h                         10
-rw-r--r--  kernel/rcupdate.c                   100
-rw-r--r--  kernel/rcutree.c                    150
-rw-r--r--  kernel/rcutree.h                     17
-rw-r--r--  kernel/rcutree_plugin.h             424
-rw-r--r--  kernel/time/Kconfig                  50
-rw-r--r--  lib/debugobjects.c                   20
16 files changed, 1233 insertions, 504 deletions
diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt
index 7f40c72a9c5..273e654d7d0 100644
--- a/Documentation/RCU/RTFP.txt
+++ b/Documentation/RCU/RTFP.txt
@@ -39,7 +39,7 @@ in read-mostly situations. This algorithm does take pains to avoid
write-side contention and parallelize the other write-side overheads by
providing a fine-grained locking design, however, it would be interesting
to see how much of the performance advantage reported in 1990 remains
-in 2004.
+today.
At about this same time, Adams [Adams91] described ``chaotic relaxation'',
where the normal barriers between successive iterations of convergent
@@ -86,9 +86,9 @@ DYNIX/ptx kernel. The corresponding conference paper appeared in 1998
[McKenney98].
In 1999, the Tornado and K42 groups described their "generations"
-mechanism, which quite similar to RCU [Gamsa99]. These operating systems
-made pervasive use of RCU in place of "existence locks", which greatly
-simplifies locking hierarchies.
+mechanism, which is quite similar to RCU [Gamsa99]. These operating
+systems made pervasive use of RCU in place of "existence locks", which
+greatly simplifies locking hierarchies and helps avoid deadlocks.
2001 saw the first RCU presentation involving Linux [McKenney01a]
at OLS. The resulting abundance of RCU patches was presented the
@@ -106,8 +106,11 @@ these techniques still impose significant read-side overhead in the
form of memory barriers. Researchers at Sun worked along similar lines
in the same timeframe [HerlihyLM02]. These techniques can be thought
of as inside-out reference counts, where the count is represented by the
-number of hazard pointers referencing a given data structure (rather than
-the more conventional counter field within the data structure itself).
+number of hazard pointers referencing a given data structure rather than
+the more conventional counter field within the data structure itself.
+The key advantage of inside-out reference counts is that they can be
+stored in immortal variables, thus allowing races between access and
+deletion to be avoided.
By the same token, RCU can be thought of as a "bulk reference count",
where some form of reference counter covers all reference by a given CPU
@@ -179,7 +182,25 @@ tree using software transactional memory to protect concurrent updates
(strange, but true!) [PhilHoward2011RCUTMRBTree], yet another variant of
RCU-protected resizeable hash tables [Triplett:2011:RPHash], the 3.0 RCU
trainwreck [PaulEMcKenney2011RCU3.0trainwreck], and Neil Brown's "Meet the
-Lockers" LWN article [NeilBrown2011MeetTheLockers].
+Lockers" LWN article [NeilBrown2011MeetTheLockers]. Some academic
+work looked at debugging uses of RCU [Seyster:2011:RFA:2075416.2075425].
+
+In 2012, Josh Triplett received his Ph.D. with his dissertation
+covering RCU-protected resizable hash tables and the relationship
+between memory barriers and read-side traversal order: If the updater
+is making changes in the opposite direction from the read-side traveral
+order, the updater need only execute a memory-barrier instruction,
+but if in the same direction, the updater needs to wait for a grace
+period between the individual updates [JoshTriplettPhD]. Also in 2012,
+after seventeen years of attempts, an RCU paper made it into a top-flight
+academic journal, IEEE Transactions on Parallel and Distributed Systems
+[MathieuDesnoyers2012URCU]. A group of researchers in Spain applied
+user-level RCU to crowd simulation [GuillermoVigueras2012RCUCrowd], and
+another group of researchers in Europe produced a formal description of
+RCU based on separation logic [AlexeyGotsman2012VerifyGraceExtended],
+which was published in the 2013 European Symposium on Programming
+[AlexeyGotsman2013ESOPRCU].
+
Bibtex Entries
@@ -193,13 +214,12 @@ Bibtex Entries
,volume="5"
,number="3"
,pages="354-382"
-,note="Available:
-\url{http://portal.acm.org/citation.cfm?id=320619&dl=GUIDE,}
-[Viewed December 3, 2007]"
,annotation={
Use garbage collector to clean up data after everyone is done with it.
.
Oldest use of something vaguely resembling RCU that I have found.
+ http://portal.acm.org/citation.cfm?id=320619&dl=GUIDE,
+ [Viewed December 3, 2007]
}
}
@@ -309,7 +329,7 @@ for Programming Languages and Operating Systems}"
,doi = {http://doi.acm.org/10.1145/42392.42399}
,publisher = {ACM}
,address = {New York, NY, USA}
-,annotation= {
+,annotation={
At the top of page 307: "Conflicts with deposits and withdrawals
are necessary if the reported total is to be up to date. They
could be avoided by having total return a sum that is slightly
@@ -346,8 +366,9 @@ for Programming Languages and Operating Systems}"
}
}
-@Book{Adams91
-,Author="Gregory R. Adams"
+# Was Adams91, see also syncrefs.bib.
+@Book{Andrews91textbook
+,Author="Gregory R. Andrews"
,title="Concurrent Programming, Principles, and Practices"
,Publisher="Benjamin Cummins"
,Year="1991"
@@ -398,39 +419,39 @@ for Programming Languages and Operating Systems}"
}
}
-@conference{Pu95a,
-Author = "Calton Pu and Tito Autrey and Andrew Black and Charles Consel and
+@conference{Pu95a
+,Author = "Calton Pu and Tito Autrey and Andrew Black and Charles Consel and
Crispin Cowan and Jon Inouye and Lakshmi Kethana and Jonathan Walpole and
-Ke Zhang",
-Title = "Optimistic Incremental Specialization: Streamlining a Commercial
-Operating System",
-Booktitle = "15\textsuperscript{th} ACM Symposium on
-Operating Systems Principles (SOSP'95)",
-address = "Copper Mountain, CO",
-month="December",
-year="1995",
-pages="314-321",
-annotation="
+Ke Zhang"
+,Title = "Optimistic Incremental Specialization: Streamlining a Commercial
+,Operating System"
+,Booktitle = "15\textsuperscript{th} ACM Symposium on
+,Operating Systems Principles (SOSP'95)"
+,address = "Copper Mountain, CO"
+,month="December"
+,year="1995"
+,pages="314-321"
+,annotation={
Uses a replugger, but with a flag to signal when people are
using the resource at hand. Only one reader at a time.
-"
-}
-
-@conference{Cowan96a,
-Author = "Crispin Cowan and Tito Autrey and Charles Krasic and
-Calton Pu and Jonathan Walpole",
-Title = "Fast Concurrent Dynamic Linking for an Adaptive Operating System",
-Booktitle = "International Conference on Configurable Distributed Systems
-(ICCDS'96)",
-address = "Annapolis, MD",
-month="May",
-year="1996",
-pages="108",
-isbn="0-8186-7395-8",
-annotation="
+}
+}
+
+@conference{Cowan96a
+,Author = "Crispin Cowan and Tito Autrey and Charles Krasic and
+,Calton Pu and Jonathan Walpole"
+,Title = "Fast Concurrent Dynamic Linking for an Adaptive Operating System"
+,Booktitle = "International Conference on Configurable Distributed Systems
+(ICCDS'96)"
+,address = "Annapolis, MD"
+,month="May"
+,year="1996"
+,pages="108"
+,isbn="0-8186-7395-8"
+,annotation={
Uses a replugger, but with a counter to signal when people are
using the resource at hand. Allows multiple readers.
-"
+}
}
@techreport{Slingwine95
@@ -493,14 +514,13 @@ Problems"
,Year="1998"
,pages="509-518"
,Address="Las Vegas, NV"
-,note="Available:
-\url{http://www.rdrop.com/users/paulmck/RCU/rclockpdcsproof.pdf}
-[Viewed December 3, 2007]"
,annotation={
Describes and analyzes RCU mechanism in DYNIX/ptx. Describes
application to linked list update and log-buffer flushing.
Defines 'quiescent state'. Includes both measured and analytic
evaluation.
+ http://www.rdrop.com/users/paulmck/RCU/rclockpdcsproof.pdf
+ [Viewed December 3, 2007]
}
}
@@ -514,13 +534,12 @@ Operating System Design and Implementation}"
,Year="1999"
,pages="87-100"
,Address="New Orleans, LA"
-,note="Available:
-\url{http://www.usenix.org/events/osdi99/full_papers/gamsa/gamsa.pdf}
-[Viewed August 30, 2006]"
,annotation={
Use of RCU-like facility in K42/Tornado. Another independent
invention of RCU.
See especially pages 7-9 (Section 5).
+ http://www.usenix.org/events/osdi99/full_papers/gamsa/gamsa.pdf
+ [Viewed August 30, 2006]
}
}
@@ -611,9 +630,9 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=100259266316456&w=2}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Memory-barrier and Alpha thread. 100 messages, not too bad...
-"
+}
}
@unpublished{Spraul01
@@ -624,10 +643,10 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=100264675012867&w=2}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Suggested burying memory barriers in Linux's list-manipulation
primitives.
-"
+}
}
@unpublished{LinusTorvalds2001a
@@ -638,6 +657,8 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,note="Available:
\url{http://lkml.org/lkml/2001/10/13/105}
[Viewed August 21, 2004]"
+,annotation={
+}
}
@unpublished{Blanchard02a
@@ -657,10 +678,10 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,Month="June"
,Year="2002"
,pages="289-300"
-,annotation="
+,annotation={
Measured scalability of Linux 2.4 kernel's directory-entry cache
(dcache), and measured some scalability enhancements.
-"
+}
}
@Conference{McKenney02a
@@ -674,10 +695,10 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
,note="Available:
\url{http://www.linux.org.uk/~ajh/ols2002_proceedings.pdf.gz}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Presented and compared a number of RCU implementations for the
Linux kernel.
-"
+}
}
@unpublished{Sarma02a
@@ -688,9 +709,9 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=102645767914212&w=2}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Compare fastwalk and RCU for dcache. RCU won.
-"
+}
}
@unpublished{Barbieri02
@@ -701,9 +722,9 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=103082050621241&w=2}
[Viewed: June 23, 2004]"
-,annotation="
+,annotation={
Suggested RCU for vfs\_shared\_cred.
-"
+}
}
@unpublished{Dickins02a
@@ -722,10 +743,10 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=103462075416638&w=2}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Performance of dcache RCU on kernbench for 16x NUMA-Q and 1x,
2x, and 4x systems. RCU does no harm, and helps on 16x.
-"
+}
}
@unpublished{LinusTorvalds2003a
@@ -736,14 +757,14 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
,note="Available:
\url{http://lkml.org/lkml/2003/3/9/205}
[Viewed March 13, 2006]"
-,annotation="
+,annotation={
Linus suggests replacing brlock with RCU and/or seqlocks:
.
'It's entirely possible that the current user could be replaced
by RCU and/or seqlocks, and we could get rid of brlocks entirely.'
.
Steve Hemminger responds by replacing them with RCU.
-"
+}
}
@article{Appavoo03a
@@ -758,9 +779,9 @@ B. Rosenburg and M. Stumm and J. Xenidis"
,volume="42"
,number="1"
,pages="60-76"
-,annotation="
+,annotation={
Use of RCU to enable hot-swapping for autonomic behavior in K42.
-"
+}
}
@unpublished{Seigh03
@@ -769,9 +790,9 @@ B. Rosenburg and M. Stumm and J. Xenidis"
,Year="2003"
,Month="March"
,note="email correspondence"
-,annotation="
+,annotation={
Described the relationship of the VM/XA passive serialization to RCU.
-"
+}
}
@Conference{Arcangeli03
@@ -785,14 +806,12 @@ Dipankar Sarma"
,year="2003"
,month="June"
,pages="297-310"
-,note="Available:
-\url{http://www.rdrop.com/users/paulmck/RCU/rcu.FREENIX.2003.06.14.pdf}
-[Viewed November 21, 2007]"
-,annotation="
+,annotation={
Compared updated RCU implementations for the Linux kernel, and
described System V IPC use of RCU, including order-of-magnitude
performance improvements.
-"
+ http://www.rdrop.com/users/paulmck/RCU/rcu.FREENIX.2003.06.14.pdf
+}
}
@Conference{Soules03a
@@ -820,10 +839,10 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,note="Available:
\url{http://www.linuxjournal.com/article/6993}
[Viewed November 14, 2007]"
-,annotation="
+,annotation={
Reader-friendly intro to RCU, with the infamous old-man-and-brat
cartoon.
-"
+}
}
@unpublished{Sarma03a
@@ -832,7 +851,9 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,month="December"
,year="2003"
,note="Message ID: 20031222180114.GA2248@in.ibm.com"
-,annotation="dipankar/ct.2004.03.27/RCUll.2003.12.22.patch"
+,annotation={
+ dipankar/ct.2004.03.27/RCUll.2003.12.22.patch
+}
}
@techreport{Friedberg03a
@@ -844,11 +865,11 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,number="US Patent 6,662,184"
,month="December"
,pages="112"
-,annotation="
+,annotation={
Applies RCU to a wildcard-search Patricia tree in order to permit
synchronization-free lookup. RCU is used to retain removed nodes
for a grace period before freeing them.
-"
+}
}
@article{McKenney04a
@@ -860,12 +881,11 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,volume="1"
,number="118"
,pages="38-46"
-,note="Available:
-\url{http://www.linuxjournal.com/node/7124}
-[Viewed December 26, 2010]"
-,annotation="
+,annotation={
Reader friendly intro to dcache and RCU.
-"
+ http://www.linuxjournal.com/node/7124
+ [Viewed December 26, 2010]
+}
}
@Conference{McKenney04b
@@ -879,10 +899,10 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
\url{http://www.linux.org.au/conf/2004/abstracts.html#90}
\url{http://www.rdrop.com/users/paulmck/RCU/lockperf.2004.01.17a.pdf}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Compares performance of RCU to that of other locking primitives
over a number of CPUs (x86, Opteron, Itanium, and PPC).
-"
+}
}
@unpublished{Sarma04a
@@ -891,7 +911,9 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,month="March"
,year="2004"
,note="\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=108003746402892&w=2}"
-,annotation="Head of thread: dipankar/2004.03.23/rcu-low-lat.1.patch"
+,annotation={
+ Head of thread: dipankar/2004.03.23/rcu-low-lat.1.patch
+}
}
@unpublished{Sarma04b
@@ -900,7 +922,9 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,month="March"
,year="2004"
,note="\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=108016474829546&w=2}"
-,annotation="dipankar/rcuth.2004.03.24/rcu-throttle.patch"
+,annotation={
+ dipankar/rcuth.2004.03.24/rcu-throttle.patch
+}
}
@unpublished{Spraul04a
@@ -911,9 +935,9 @@ Michal Ostrowski and Bryan Rosenburg and Jimi Xenidis"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=108546407726602&w=2}
[Viewed June 23, 2004]"
-,annotation="
+,annotation={
Hierarchical-bitmap patch for RCU infrastructure.
-"
+}
}
@unpublished{Steiner04a
@@ -950,10 +974,12 @@ Realtime Applications"
,year="2004"
,month="June"
,pages="182-191"
-,annotation="
+,annotation={
Describes and compares a number of modifications to the Linux RCU
implementation that make it friendly to realtime applications.
-"
+ https://www.usenix.org/conference/2004-usenix-annual-technical-conference/making-rcu-safe-deep-sub-millisecond-response
+ [Viewed July 26, 2012]
+}
}
@phdthesis{PaulEdwardMcKenneyPhD
@@ -964,14 +990,13 @@ in Operating System Kernels"
,school="OGI School of Science and Engineering at
Oregon Health and Sciences University"
,year="2004"
-,note="Available:
-\url{http://www.rdrop.com/users/paulmck/RCU/RCUdissertation.2004.07.14e1.pdf}
-[Viewed October 15, 2004]"
-,annotation="
+,annotation={
Describes RCU implementations and presents design patterns
corresponding to common uses of RCU in several operating-system
kernels.
-"
+ http://www.rdrop.com/users/paulmck/RCU/RCUdissertation.2004.07.14e1.pdf
+ [Viewed October 15, 2004]
+}
}
@unpublished{PaulEMcKenney2004rcu:dereference
@@ -982,9 +1007,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://lkml.org/lkml/2004/8/6/237}
[Viewed June 8, 2010]"
-,annotation="
+,annotation={
Introduce rcu_dereference().
-"
+}
}
@unpublished{JimHouston04a
@@ -995,11 +1020,11 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://lkml.org/lkml/2004/8/30/87}
[Viewed February 17, 2005]"
-,annotation="
+,annotation={
Uses active code in rcu_read_lock() and rcu_read_unlock() to
make RCU happen, allowing RCU to function on CPUs that do not
receive a scheduling-clock interrupt.
-"
+}
}
@unpublished{TomHart04a
@@ -1010,9 +1035,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://www.cs.toronto.edu/~tomhart/masters_thesis.html}
[Viewed October 15, 2004]"
-,annotation="
+,annotation={
Proposes comparing RCU to lock-free methods for the Linux kernel.
-"
+}
}
@unpublished{Vaddagiri04a
@@ -1023,9 +1048,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://marc.theaimsgroup.com/?t=109395731700004&r=1&w=2}
[Viewed October 18, 2004]"
-,annotation="
+,annotation={
Srivatsa's RCU patch for tcp_ehash lookup.
-"
+}
}
@unpublished{Thirumalai04a
@@ -1036,9 +1061,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://marc.theaimsgroup.com/?t=109144217400003&r=1&w=2}
[Viewed October 18, 2004]"
-,annotation="
+,annotation={
Ravikiran's lockfree FD patch.
-"
+}
}
@unpublished{Thirumalai04b
@@ -1049,9 +1074,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=109152521410459&w=2}
[Viewed October 18, 2004]"
-,annotation="
+,annotation={
Ravikiran's lockfree FD patch.
-"
+}
}
@unpublished{PaulEMcKenney2004rcu:assign:pointer
@@ -1062,9 +1087,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://lkml.org/lkml/2004/10/23/241}
[Viewed June 8, 2010]"
-,annotation="
+,annotation={
Introduce rcu_assign_pointer().
-"
+}
}
@unpublished{JamesMorris04a
@@ -1073,12 +1098,12 @@ Oregon Health and Sciences University"
,day="15"
,month="November"
,year="2004"
-,note="Available:
-\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=110054979416004&w=2}
-[Viewed December 10, 2004]"
-,annotation="
+,note="\url{http://marc.theaimsgroup.com/?l=linux-kernel&m=110054979416004&w=2}"
+,annotation={
James Morris posts Kaigai Kohei's patch to LKML.
-"
+ [Viewed December 10, 2004]
+ Kaigai's patch is at https://lkml.org/lkml/2004/9/27/52
+}
}
@unpublished{JamesMorris04b
@@ -1089,9 +1114,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://www.livejournal.com/users/james_morris/2153.html}
[Viewed December 10, 2004]"
-,annotation="
+,annotation={
RCU helps SELinux performance. ;-) Made LWN.
-"
+}
}
@unpublished{PaulMcKenney2005RCUSemantics
@@ -1103,9 +1128,9 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/rcu-semantics.2005.01.30a.pdf}
[Viewed December 6, 2009]"
-,annotation="
+,annotation={
Early derivation of RCU semantics.
-"
+}
}
@unpublished{PaulMcKenney2005e
@@ -1117,10 +1142,10 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://lkml.org/lkml/2005/3/17/199}
[Viewed September 5, 2005]"
-,annotation="
+,annotation={
First posting showing how RCU can be safely adapted for
preemptable RCU read side critical sections.
-"
+}
}
@unpublished{EsbenNeilsen2005a
@@ -1132,12 +1157,12 @@ Oregon Health and Sciences University"
,note="Available:
\url{http://lkml.org/lkml/2005/3/18/122}
[Viewed March 30, 2006]"
-,annotation="
+,annotation={
Esben Neilsen suggests read-side suppression of grace-period
processing for crude-but-workable realtime RCU. The downside
- is indefinite grace periods...But this is OK for experimentation
+ is indefinite grace periods... But this is OK for experimentation
and testing.
-"
+}
}
@unpublished{TomHart05a
@@ -1149,10 +1174,10 @@ Data Structures"
,note="Available:
\url{ftp://ftp.cs.toronto.edu/csrg-technical-reports/515/}
[Viewed March 4, 2005]"
-,annotation="
+,annotation={
Comparison of RCU, QBSR, and EBSR. RCU wins for read-mostly
workloads. ;-)
-"
+}
}
@unpublished{JonCorbet2005DeprecateSyncKernel
@@ -1164,10 +1189,10 @@ Data Structures"
,note="Available:
\url{http://lwn.net/Articles/134484/}
[Viewed May 3, 2005]"
-,annotation="
+,annotation={
Jon Corbet describes deprecation of synchronize_kernel()
in favor of synchronize_rcu() and synchronize_sched().
-"
+}
}
@unpublished{PaulMcKenney05a
@@ -1178,10 +1203,10 @@ Data Structures"
,note="Available:
\url{http://lkml.org/lkml/2005/5/9/185}
[Viewed May 13, 2005]"
-,annotation="
+,annotation={
First publication of working lock-based deferred free patches
for the CONFIG_PREEMPT_RT environment.
-"
+}
}
@conference{PaulMcKenney05b
@@ -1194,10 +1219,10 @@ Data Structures"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/realtimeRCU.2005.04.23a.pdf}
[Viewed May 13, 2005]"
-,annotation="
+,annotation={
Realtime turns into making RCU yet more realtime friendly.
http://lca2005.linux.org.au/Papers/Paul%20McKenney/Towards%20Hard%20Realtime%20Response%20from%20the%20Linux%20Kernel/LKS.2005.04.22a.pdf
-"
+}
}
@unpublished{PaulEMcKenneyHomePage
@@ -1208,9 +1233,9 @@ Data Structures"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/}
[Viewed May 25, 2005]"
-,annotation="
+,annotation={
Paul McKenney's home page.
-"
+}
}
@unpublished{PaulEMcKenneyRCUPage
@@ -1221,9 +1246,9 @@ Data Structures"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU}
[Viewed May 25, 2005]"
-,annotation="
+,annotation={
Paul McKenney's RCU page.
-"
+}
}
@unpublished{JosephSeigh2005a
@@ -1232,10 +1257,10 @@ Data Structures"
,month="July"
,year="2005"
,note="Personal communication"
-,annotation="
+,annotation={
Joe Seigh announcing his atomic-ptr-plus project.
http://sourceforge.net/projects/atomic-ptr-plus/
-"
+}
}
@unpublished{JosephSeigh2005b
@@ -1247,9 +1272,9 @@ Data Structures"
,note="Available:
\url{http://sourceforge.net/projects/atomic-ptr-plus/}
[Viewed August 8, 2005]"
-,annotation="
+,annotation={
Joe Seigh's atomic-ptr-plus project.
-"
+}
}
@unpublished{PaulMcKenney2005c
@@ -1261,9 +1286,9 @@ Data Structures"
,note="Available:
\url{http://lkml.org/lkml/2005/8/1/155}
[Viewed March 14, 2006]"
-,annotation="
+,annotation={
First operating counter-based realtime RCU patch posted to LKML.
-"
+}
}
@unpublished{PaulMcKenney2005d
@@ -1275,11 +1300,11 @@ Data Structures"
,note="Available:
\url{http://lkml.org/lkml/2005/8/8/108}
[Viewed March 14, 2006]"
-,annotation="
+,annotation={
First operating counter-based realtime RCU patch posted to LKML,
but fixed so that various unusual combinations of configuration
parameters all function properly.
-"
+}
}
@unpublished{PaulMcKenney2005rcutorture
@@ -1291,9 +1316,25 @@ Data Structures"
,note="Available:
\url{http://lkml.org/lkml/2005/10/1/70}
[Viewed March 14, 2006]"
-,annotation="
+,annotation={
First rcutorture patch.
-"
+}
+}
+
+@unpublished{DavidSMiller2006HashedLocking
+,Author="David S. Miller"
+,Title="Re: [{PATCH}, {RFC}] {RCU} : {OOM} avoidance and lower latency"
+,month="January"
+,day="6"
+,year="2006"
+,note="Available:
+\url{https://lkml.org/lkml/2006/1/7/22}
+[Viewed February 29, 2012]"
+,annotation={
+ David Miller's view on hashed arrays of locks: used to really
+ like it, but time he saw an opportunity for this technique,
+ something else always proved superior. Partitioning or RCU. ;-)
+}
}
@conference{ThomasEHart2006a
@@ -1309,10 +1350,10 @@ Distributed Processing Symposium"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/hart_ipdps06.pdf}
[Viewed April 28, 2008]"
-,annotation="
+,annotation={
Compares QSBR, HPBR, EBR, and lock-free reference counting.
http://www.cs.toronto.edu/~tomhart/perflab/ipdps06.tgz
-"
+}
}
@unpublished{NickPiggin2006radixtree
@@ -1324,9 +1365,9 @@ Distributed Processing Symposium"
,note="Available:
\url{http://lkml.org/lkml/2006/6/20/238}
[Viewed March 25, 2008]"
-,annotation="
+,annotation={
RCU-protected radix tree.
-"
+}
}
@Conference{PaulEMcKenney2006b
@@ -1341,9 +1382,9 @@ Suparna Bhattacharya"
\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
\url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf}
[Viewed January 1, 2007]"
-,annotation="
+,annotation={
Described how to improve the -rt implementation of realtime RCU.
-"
+}
}
@unpublished{WikipediaRCU
@@ -1354,12 +1395,11 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen"
,month="July"
,day="8"
,year="2006"
-,note="Available:
-\url{http://en.wikipedia.org/wiki/Read-copy-update}
-[Viewed August 21, 2006]"
-,annotation="
+,note="\url{http://en.wikipedia.org/wiki/Read-copy-update}"
+,annotation={
Wikipedia RCU page as of July 8 2006.
-"
+ [Viewed August 21, 2006]
+}
}
@Conference{NickPiggin2006LocklessPageCache
@@ -1372,9 +1412,9 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen"
,note="Available:
\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
[Viewed January 11, 2009]"
-,annotation="
+,annotation={
Uses RCU-protected radix tree for a lockless page cache.
-"
+}
}
@unpublished{PaulEMcKenney2006c
@@ -1388,9 +1428,9 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen"
Revised:
\url{http://www.rdrop.com/users/paulmck/RCU/srcu.2007.01.14a.pdf}
[Viewed August 21, 2006]"
-,annotation="
+,annotation={
LWN article introducing SRCU.
-"
+}
}
@unpublished{RobertOlsson2006a
@@ -1399,12 +1439,11 @@ Revised:
,month="August"
,day="18"
,year="2006"
-,note="Available:
-\url{http://www.nada.kth.se/~snilsson/publications/TRASH/trash.pdf}
-[Viewed March 4, 2011]"
-,annotation="
+,note="\url{http://www.nada.kth.se/~snilsson/publications/TRASH/trash.pdf}"
+,annotation={
RCU-protected dynamic trie-hash combination.
-"
+ [Viewed March 4, 2011]
+}
}
@unpublished{ChristophHellwig2006RCU2SRCU
@@ -1426,10 +1465,10 @@ Revised:
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/linuxusage.html}
[Viewed January 14, 2007]"
-,annotation="
+,annotation={
Paul McKenney's RCU page showing graphs plotting Linux-kernel
usage of RCU.
-"
+}
}
@unpublished{PaulEMcKenneyRCUusageRawDataPage
@@ -1440,10 +1479,10 @@ Revised:
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html}
[Viewed January 14, 2007]"
-,annotation="
+,annotation={
Paul McKenney's RCU page showing Linux usage of RCU in tabular
form, with links to corresponding cscope databases.
-"
+}
}
@unpublished{GauthamShenoy2006RCUrwlock
@@ -1455,13 +1494,13 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2006/10/26/73}
[Viewed January 26, 2009]"
-,annotation="
+,annotation={
RCU-based reader-writer lock that allows readers to proceed with
no memory barriers or atomic instruction in absence of writers.
If writer do show up, readers must of course wait as required by
the semantics of reader-writer locking. This is a recursive
lock.
-"
+}
}
@unpublished{JensAxboe2006SlowSRCU
@@ -1474,11 +1513,11 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2006/11/17/56}
[Viewed May 28, 2007]"
-,annotation="
+,annotation={
SRCU's grace periods are too slow for Jens, even after a
factor-of-three speedup.
Sped-up version of SRCU at http://lkml.org/lkml/2006/11/17/359.
-"
+}
}
@unpublished{OlegNesterov2006QRCU
@@ -1491,10 +1530,10 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2006/11/19/69}
[Viewed May 28, 2007]"
-,annotation="
+,annotation={
First cut of QRCU. Expanded/corrected versions followed.
Used to be OlegNesterov2007QRCU, now time-corrected.
-"
+}
}
@unpublished{OlegNesterov2006aQRCU
@@ -1506,10 +1545,10 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2006/11/29/330}
[Viewed November 26, 2008]"
-,annotation="
+,annotation={
Expanded/corrected version of QRCU.
Used to be OlegNesterov2007aQRCU, now time-corrected.
-"
+}
}
@unpublished{EvgeniyPolyakov2006RCUslowdown
@@ -1521,10 +1560,10 @@ Revised:
,note="Available:
\url{http://www.ioremap.net/node/41}
[Viewed October 28, 2008]"
-,annotation="
+,annotation={
Using RCU as a pure delay leads to a 2.5x slowdown in skbs in
the Linux kernel.
-"
+}
}
@inproceedings{ChrisMatthews2006ClusteredObjectsRCU
@@ -1541,7 +1580,8 @@ Revised:
,annotation={
Uses K42's RCU-like functionality to manage clustered-object
lifetimes.
-}}
+}
+}
@article{DilmaDaSilva2006K42
,author = {Silva, Dilma Da and Krieger, Orran and Wisniewski, Robert W. and Waterland, Amos and Tam, David and Baumann, Andrew}
@@ -1557,7 +1597,8 @@ Revised:
,address = {New York, NY, USA}
,annotation={
Describes relationship of K42 generations to RCU.
-}}
+}
+}
# CoreyMinyard2007list_splice_rcu
@unpublished{CoreyMinyard2007list:splice:rcu
@@ -1569,9 +1610,9 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2007/1/3/112}
[Viewed May 28, 2007]"
-,annotation="
+,annotation={
Patch for list_splice_rcu().
-"
+}
}
@unpublished{PaulEMcKenney2007rcubarrier
@@ -1583,9 +1624,9 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/217484/}
[Viewed November 22, 2007]"
-,annotation="
+,annotation={
LWN article introducing the rcu_barrier() primitive.
-"
+}
}
@unpublished{PeterZijlstra2007SyncBarrier
@@ -1597,10 +1638,10 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2007/1/28/34}
[Viewed March 27, 2008]"
-,annotation="
+,annotation={
RCU-like implementation for frequent updaters and rare readers(!).
Subsumed into QRCU. Maybe...
-"
+}
}
@unpublished{PaulEMcKenney2007BoostRCU
@@ -1609,14 +1650,13 @@ Revised:
,month="February"
,day="5"
,year="2007"
-,note="Available:
-\url{http://lwn.net/Articles/220677/}
-Revised:
-\url{http://www.rdrop.com/users/paulmck/RCU/RCUbooststate.2007.04.16a.pdf}
-[Viewed September 7, 2007]"
-,annotation="
+,note="\url{http://lwn.net/Articles/220677/}"
+,annotation={
LWN article introducing RCU priority boosting.
-"
+ Revised:
+ http://www.rdrop.com/users/paulmck/RCU/RCUbooststate.2007.04.16a.pdf
+ [Viewed September 7, 2007]
+}
}
@unpublished{PaulMcKenney2007QRCUpatch
@@ -1628,9 +1668,9 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2007/2/25/18}
[Viewed March 27, 2008]"
-,annotation="
+,annotation={
Patch for QRCU supplying lock-free fast path.
-"
+}
}
@article{JonathanAppavoo2007K42RCU
@@ -1647,7 +1687,8 @@ Revised:
,address = {New York, NY, USA}
,annotation={
Role of RCU in K42.
-}}
+}
+}
@conference{RobertOlsson2007Trash
,Author="Robert Olsson and Stefan Nilsson"
@@ -1658,9 +1699,9 @@ Revised:
,note="Available:
\url{http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4281239}
[Viewed October 1, 2010]"
-,annotation="
+,annotation={
RCU-protected dynamic trie-hash combination.
-"
+}
}
@conference{PeterZijlstra2007ConcurrentPagecacheRCU
@@ -1673,10 +1714,10 @@ Revised:
,note="Available:
\url{http://ols.108.redhat.com/2007/Reprints/zijlstra-Reprint.pdf}
[Viewed April 14, 2008]"
-,annotation="
+,annotation={
Page-cache modifications permitting RCU readers and concurrent
updates.
-"
+}
}
@unpublished{PaulEMcKenney2007whatisRCU
@@ -1701,11 +1742,11 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/243851/}
[Viewed September 8, 2007]"
-,annotation="
+,annotation={
LWN article describing Promela and spin, and also using Oleg
Nesterov's QRCU as an example (with Paul McKenney's fastpath).
Merged patch at: http://lkml.org/lkml/2007/2/25/18
-"
+}
}
@unpublished{PaulEMcKenney2007WG21DDOatomics
@@ -1714,12 +1755,12 @@ Revised:
,month="August"
,day="3"
,year="2007"
-,note="Preprint:
+,note="Available:
\url{http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2664.htm}
[Viewed December 7, 2009]"
-,annotation="
+,annotation={
RCU for C++, parts 1 and 2.
-"
+}
}
@unpublished{PaulEMcKenney2007WG21DDOannotation
@@ -1728,12 +1769,12 @@ Revised:
,month="September"
,day="18"
,year="2008"
-,note="Preprint:
+,note="Available:
\url{http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2782.htm}
[Viewed December 7, 2009]"
-,annotation="
+,annotation={
RCU for C++, part 2, updated many times.
-"
+}
}
@unpublished{PaulEMcKenney2007PreemptibleRCUPatch
@@ -1745,10 +1786,10 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2007/9/10/213}
[Viewed October 25, 2007]"
-,annotation="
+,annotation={
Final patch for preemptable RCU to -rt. (Later patches were
to mainline, eventually incorporated.)
-"
+}
}
@unpublished{PaulEMcKenney2007PreemptibleRCU
@@ -1760,9 +1801,9 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/253651/}
[Viewed October 25, 2007]"
-,annotation="
+,annotation={
LWN article describing the design of preemptible RCU.
-"
+}
}
@article{ThomasEHart2007a
@@ -1783,6 +1824,7 @@ Revised:
}
}
+# MathieuDesnoyers2007call_rcu_schedNeeded
@unpublished{MathieuDesnoyers2007call:rcu:schedNeeded
,Author="Mathieu Desnoyers"
,Title="Re: [patch 1/2] {Linux} Kernel Markers - Support Multiple Probes"
@@ -1792,9 +1834,9 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2007/12/20/244}
[Viewed March 27, 2008]"
-,annotation="
+,annotation={
Request for call_rcu_sched() and rcu_barrier_sched().
-"
+}
}
@@ -1815,11 +1857,11 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/262464/}
[Viewed December 27, 2007]"
-,annotation="
+,annotation={
Lays out the three basic components of RCU: (1) publish-subscribe,
(2) wait for pre-existing readers to complete, and (3) maintain
multiple versions.
-"
+}
}
@unpublished{PaulEMcKenney2008WhatIsRCUUsage
@@ -1831,7 +1873,7 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/263130/}
[Viewed January 4, 2008]"
-,annotation="
+,annotation={
Lays out six uses of RCU:
1. RCU is a Reader-Writer Lock Replacement
2. RCU is a Restricted Reference-Counting Mechanism
@@ -1839,7 +1881,7 @@ Revised:
4. RCU is a Poor Man's Garbage Collector
5. RCU is a Way of Providing Existence Guarantees
6. RCU is a Way of Waiting for Things to Finish
-"
+}
}
@unpublished{PaulEMcKenney2008WhatIsRCUAPI
@@ -1851,10 +1893,10 @@ Revised:
,note="Available:
\url{http://lwn.net/Articles/264090/}
[Viewed January 10, 2008]"
-,annotation="
+,annotation={
Gives an overview of the Linux-kernel RCU API and a brief annotated RCU
bibliography.
-"
+}
}
#
@@ -1872,10 +1914,10 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2008/1/29/208}
[Viewed March 27, 2008]"
-,annotation="
+,annotation={
Patch that prevents preemptible RCU from unnecessarily waking
up dynticks-idle CPUs.
-"
+}
}
@unpublished{PaulEMcKenney2008LKMLDependencyOrdering
@@ -1887,9 +1929,9 @@ Revised:
,note="Available:
\url{http://lkml.org/lkml/2008/2/2/255}
[Viewed October 18, 2008]"
-,annotation="
+,annotation={
Explanation of compilers violating dependency ordering.
-"
+}
}
@Conference{PaulEMcKenney2008Beijing
@@ -1916,24 +1958,26 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lwn.net/Articles/279077/}
[Viewed April 24, 2008]"
-,annotation="
+,annotation={
Describes use of Promela and Spin to validate (and fix!) the
dynticks/RCU interface.
-"
+}
}
@article{DinakarGuniguntala2008IBMSysJ
,author="D. Guniguntala and P. E. McKenney and J. Triplett and J. Walpole"
,title="The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with {Linux}"
,Year="2008"
-,Month="April-June"
+,Month="May"
,journal="IBM Systems Journal"
,volume="47"
,number="2"
,pages="221-236"
-,annotation="
+,annotation={
RCU, realtime RCU, sleepable RCU, performance.
-"
+ http://www.research.ibm.com/journal/sj/472/guniguntala.pdf
+ [Viewed April 24, 2008]
+}
}
@unpublished{LaiJiangshan2008NewClassicAlgorithm
@@ -1945,11 +1989,11 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2008/6/2/539}
[Viewed December 10, 2008]"
-,annotation="
+,annotation={
Updated RCU classic algorithm. Introduced multi-tailed list
for RCU callbacks and also pulling common code into
__call_rcu().
-"
+}
}
@article{PaulEMcKenney2008RCUOSR
@@ -1966,6 +2010,7 @@ lot of {Linux} into your technology!!!"
,address="New York, NY, USA"
,annotation={
Linux changed RCU to a far greater degree than RCU has changed Linux.
+ http://portal.acm.org/citation.cfm?doid=1400097.1400099
}
}
@@ -1978,10 +2023,10 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2008/8/21/336}
[Viewed December 8, 2008]"
-,annotation="
+,annotation={
State-based RCU. One key thing that this patch does is to
separate the dynticks handling of NMIs and IRQs.
-"
+}
}
@unpublished{ManfredSpraul2008dyntickIRQNMI
@@ -1993,12 +2038,13 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2008/9/6/86}
[Viewed December 8, 2008]"
-,annotation="
+,annotation={
Manfred notes a fix required to my attempt to separate irq
and NMI processing for hierarchical RCU's dynticks interface.
-"
+}
}
+# Was PaulEMcKenney2011cyclicRCU
@techreport{PaulEMcKenney2008cyclicRCU
,author="Paul E. McKenney"
,title="Efficient Support of Consistent Cyclic Search With Read-Copy Update"
@@ -2008,11 +2054,11 @@ lot of {Linux} into your technology!!!"
,number="US Patent 7,426,511"
,month="September"
,pages="23"
-,annotation="
+,annotation={
Maintains an additional level of indirection to allow
readers to confine themselves to the desired snapshot of the
data structure. Only permits one update at a time.
-"
+}
}
@unpublished{PaulEMcKenney2008HierarchicalRCU
@@ -2021,13 +2067,12 @@ lot of {Linux} into your technology!!!"
,month="November"
,day="3"
,year="2008"
-,note="Available:
-\url{http://lwn.net/Articles/305782/}
-[Viewed November 6, 2008]"
-,annotation="
+,note="\url{http://lwn.net/Articles/305782/}"
+,annotation={
RCU with combining-tree-based grace-period detection,
permitting it to handle thousands of CPUs.
-"
+ [Viewed November 6, 2008]
+}
}
@unpublished{PaulEMcKenney2009BloatwatchRCU
@@ -2039,10 +2084,10 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2009/1/14/449}
[Viewed January 15, 2009]"
-,annotation="
+,annotation={
Small-footprint implementation of RCU for uniprocessor
embedded applications -- and also for exposition purposes.
-"
+}
}
@conference{PaulEMcKenney2009MaliciousURCU
@@ -2055,9 +2100,9 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/urcutorture.2009.01.22a.pdf}
[Viewed February 2, 2009]"
-,annotation="
+,annotation={
Realtime RCU and torture-testing RCU uses.
-"
+}
}
@unpublished{MathieuDesnoyers2009URCU
@@ -2066,16 +2111,14 @@ lot of {Linux} into your technology!!!"
,month="February"
,day="5"
,year="2009"
-,note="Available:
-\url{http://lkml.org/lkml/2009/2/5/572}
-\url{http://lttng.org/urcu}
-[Viewed February 20, 2009]"
-,annotation="
+,note="\url{http://lttng.org/urcu}"
+,annotation={
Mathieu Desnoyers's user-space RCU implementation.
git://lttng.org/userspace-rcu.git
http://lttng.org/cgi-bin/gitweb.cgi?p=userspace-rcu.git
http://lttng.org/urcu
-"
+ http://lkml.org/lkml/2009/2/5/572
+}
}
@unpublished{PaulEMcKenney2009LWNBloatWatchRCU
@@ -2087,9 +2130,24 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lwn.net/Articles/323929/}
[Viewed March 20, 2009]"
-,annotation="
+,annotation={
Uniprocessor assumptions allow simplified RCU implementation.
-"
+}
+}
+
+@unpublished{EvgeniyPolyakov2009EllipticsNetwork
+,Author="Evgeniy Polyakov"
+,Title="The Elliptics Network"
+,month="April"
+,day="17"
+,year="2009"
+,note="Available:
+\url{http://www.ioremap.net/projects/elliptics}
+[Viewed April 30, 2009]"
+,annotation={
+ Distributed hash table with transactions, using elliptic
+ hash functions to distribute data.
+}
}
@unpublished{PaulEMcKenney2009expeditedRCU
@@ -2101,9 +2159,9 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2009/6/25/306}
[Viewed August 16, 2009]"
-,annotation="
+,annotation={
First posting of expedited RCU to be accepted into -tip.
-"
+}
}
@unpublished{PaulEMcKenney2009fastRTRCU
@@ -2115,21 +2173,21 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2009/7/23/294}
[Viewed August 15, 2009]"
-,annotation="
+,annotation={
First posting of simple and fast preemptable RCU.
-"
+}
}
-@InProceedings{JoshTriplett2009RPHash
+@unpublished{JoshTriplett2009RPHash
,Author="Josh Triplett"
,Title="Scalable concurrent hash tables via relativistic programming"
,month="September"
,year="2009"
-,booktitle="Linux Plumbers Conference 2009"
-,annotation="
+,note="Linux Plumbers Conference presentation"
+,annotation={
RP fun with hash tables.
- See also JoshTriplett2010RPHash
-"
+ Superseded by JoshTriplett2010RPHash
+}
}
@phdthesis{MathieuDesnoyersPhD
@@ -2154,9 +2212,9 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://wiki.cs.pdx.edu/rp/}
[Viewed December 9, 2009]"
-,annotation="
+,annotation={
Main Relativistic Programming Wiki.
-"
+}
}
@conference{PaulEMcKenney2009DeterministicRCU
@@ -2180,9 +2238,9 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://paulmck.livejournal.com/14639.html}
[Viewed June 4, 2010]"
-,annotation="
+,annotation={
Day-one bug in Tree RCU that took forever to track down.
-"
+}
}
@unpublished{MathieuDesnoyers2009defer:rcu
@@ -2193,10 +2251,10 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://lkml.org/lkml/2009/10/18/129}
[Viewed December 29, 2009]"
-,annotation="
+,annotation={
Mathieu proposed defer_rcu() with fixed-size per-thread pool
of RCU callbacks.
-"
+}
}
@unpublished{MathieuDesnoyers2009VerifPrePub
@@ -2205,10 +2263,10 @@ lot of {Linux} into your technology!!!"
,month="December"
,year="2009"
,note="Submitted to IEEE TPDS"
-,annotation="
+,annotation={
OOMem model for Mathieu's user-level RCU mechanical proof of
correctness.
-"
+}
}
@unpublished{MathieuDesnoyers2009URCUPrePub
@@ -2216,15 +2274,15 @@ lot of {Linux} into your technology!!!"
,Title="User-Level Implementations of Read-Copy Update"
,month="December"
,year="2010"
-,url=\url{http://www.computer.org/csdl/trans/td/2012/02/ttd2012020375-abs.html}
-,annotation="
+,url={\url{http://www.computer.org/csdl/trans/td/2012/02/ttd2012020375-abs.html}}
+,annotation={
RCU overview, desiderata, semi-formal semantics, user-level RCU
usage scenarios, three classes of RCU implementation, wait-free
RCU updates, RCU grace-period batching, update overhead,
http://www.rdrop.com/users/paulmck/RCU/urcu-main-accepted.2011.08.30a.pdf
http://www.rdrop.com/users/paulmck/RCU/urcu-supp-accepted.2011.08.30a.pdf
Superseded by MathieuDesnoyers2012URCU.
-"
+}
}
@inproceedings{HariKannan2009DynamicAnalysisRCU
@@ -2240,7 +2298,8 @@ lot of {Linux} into your technology!!!"
,address = {New York, NY, USA}
,annotation={
Uses RCU to protect metadata used in dynamic analysis.
-}}
+}
+}
@conference{PaulEMcKenney2010SimpleOptRCU
,Author="Paul E. McKenney"
@@ -2252,10 +2311,10 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://www.rdrop.com/users/paulmck/RCU/SimplicityThruOptimization.2010.01.21f.pdf}
[Viewed October 10, 2010]"
-,annotation="
+,annotation={
TREE_PREEMPT_RCU optimizations greatly simplified the old
PREEMPT_RCU implementation.
-"
+}
}
@unpublished{PaulEMcKenney2010LockdepRCU
@@ -2264,12 +2323,11 @@ lot of {Linux} into your technology!!!"
,month="February"
,year="2010"
,day="1"
-,note="Available:
-\url{https://lwn.net/Articles/371986/}
-[Viewed June 4, 2010]"
-,annotation="
+,note="\url{https://lwn.net/Articles/371986/}"
+,annotation={
CONFIG_PROVE_RCU, or at least an early version.
-"
+ [Viewed June 4, 2010]
+}
}
@unpublished{AviKivity2010KVM2RCU
@@ -2280,10 +2338,10 @@ lot of {Linux} into your technology!!!"
,note="Available:
\url{http://www.mail-archive.com/kvm@vger.kernel.org/msg28640.html}
[Viewed March 20, 2010]"
-,annotation="
+,annotation={
Use of RCU permits KVM to increase the size of guest OSes from
16 CPUs to 64 CPUs.
-"
+}
}
@unpublished{HerbertXu2010RCUResizeHash
@@ -2297,7 +2355,19 @@ lot of {Linux} into your technology!!!"
,annotation={
Use a pair of list_head structures to support RCU-protected
resizable hash tables.
-}}
+}
+}
+
+@mastersthesis{AbhinavDuggal2010Masters
+,author="Abhinav Duggal"
+,title="Stopping Data Races Using Redflag"
+,school="Stony Brook University"
+,year="2010"
+,annotation={
+ Data-race detector incorporating RCU.
+ http://www.filesystems.org/docs/abhinav-thesis/abhinav_thesis.pdf
+}
+}
@article{JoshTriplett2010RPHash
,author="Josh Triplett and Paul E. McKenney and Jonathan Walpole"
@@ -2310,7 +2380,8 @@ lot of {Linux} into your technology!!!"
,annotation={
RP fun with hash tables.
http://portal.acm.org/citation.cfm?id=1842733.1842750
-}}
+}
+}
@unpublished{PaulEMcKenney2010RCUAPI
,Author="Paul E. McKenney"
@@ -2318,12 +2389,11 @@ lot of {Linux} into your technology!!!"
,month="December"
,day="8"
,year="2010"
-,note="Available:
-\url{http://lwn.net/Articles/418853/}
-[Viewed December 8, 2010]"
-,annotation="
+,note="\url{http://lwn.net/Articles/418853/}"
+,annotation={
Includes updated software-engineering features.
-"
+ [Viewed December 8, 2010]
+}
}
@mastersthesis{AndrejPodzimek2010masters
@@ -2338,7 +2408,8 @@ lot of {Linux} into your technology!!!"
Reviews RCU implementations and creates a few for OpenSolaris.
Drives quiescent-state detection from RCU read-side primitives,
in a manner roughly similar to that of Jim Houston.
-}}
+}
+}
@unpublished{LinusTorvalds2011Linux2:6:38:rc1:NPigginVFS
,Author="Linus Torvalds"
@@ -2358,7 +2429,8 @@ lot of {Linux} into your technology!!!"
of the most expensive parts of path component lookup, which was the
d_lock on every component lookup. So I'm seeing improvements of 30-50%
on some seriously pathname-lookup intensive loads."
-}}
+}
+}
@techreport{JoshTriplett2011RPScalableCorrectOrdering
,author = {Josh Triplett and Philip W. Howard and Paul E. McKenney and Jonathan Walpole}
@@ -2392,12 +2464,12 @@ lot of {Linux} into your technology!!!"
,number="US Patent 7,953,778"
,month="May"
,pages="34"
-,annotation="
+,annotation={
Maintains an array of generation numbers to track in-flight
updates and keeps an additional level of indirection to allow
readers to confine themselves to the desired snapshot of the
data structure.
-"
+}
}
@inproceedings{Triplett:2011:RPHash
@@ -2408,7 +2480,7 @@ lot of {Linux} into your technology!!!"
,year = {2011}
,pages = {145--158}
,numpages = {14}
-,url={http://www.usenix.org/event/atc11/tech/final_files/atc11_proceedings.pdf}
+,url={http://www.usenix.org/event/atc11/tech/final_files/Triplett.pdf}
,publisher = {The USENIX Association}
,address = {Portland, OR USA}
}
@@ -2419,27 +2491,58 @@ lot of {Linux} into your technology!!!"
,month="July"
,day="27"
,year="2011"
-,note="Available:
-\url{http://lwn.net/Articles/453002/}
-[Viewed July 27, 2011]"
-,annotation="
+,note="\url{http://lwn.net/Articles/453002/}"
+,annotation={
Analysis of the RCU trainwreck in Linux kernel 3.0.
-"
+ [Viewed July 27, 2011]
+}
}
@unpublished{NeilBrown2011MeetTheLockers
,Author="Neil Brown"
-,Title="Meet the Lockers"
+,Title="Meet the {Lockers}"
,month="August"
,day="3"
,year="2011"
,note="Available:
\url{http://lwn.net/Articles/453685/}
[Viewed September 2, 2011]"
-,annotation="
+,annotation={
The Locker family as an analogy for locking, reference counting,
RCU, and seqlock.
-"
+}
+}
+
+@inproceedings{Seyster:2011:RFA:2075416.2075425
+,author = {Seyster, Justin and Radhakrishnan, Prabakar and Katoch, Samriti and Duggal, Abhinav and Stoller, Scott D. and Zadok, Erez}
+,title = {Redflag: a framework for analysis of Kernel-level concurrency}
+,booktitle = {Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I}
+,series = {ICA3PP'11}
+,year = {2011}
+,isbn = {978-3-642-24649-4}
+,location = {Melbourne, Australia}
+,pages = {66--79}
+,numpages = {14}
+,url = {http://dl.acm.org/citation.cfm?id=2075416.2075425}
+,acmid = {2075425}
+,publisher = {Springer-Verlag}
+,address = {Berlin, Heidelberg}
+}
+
+@phdthesis{JoshTriplettPhD
+,author="Josh Triplett"
+,title="Relativistic Causal Ordering: A Memory Model for Scalable Concurrent Data Structures"
+,school="Portland State University"
+,year="2012"
+,annotation={
+ RCU-protected hash tables, barriers vs. read-side traversal order.
+ .
+ If the updater is making changes in the opposite direction from
+ the read-side traversal order, the updater need only execute a
+ memory-barrier instruction, but if in the same direction, the
+ updater needs to wait for a grace period between the individual
+ updates.
+}
}
@article{MathieuDesnoyers2012URCU
@@ -2459,5 +2562,150 @@ lot of {Linux} into your technology!!!"
RCU updates, RCU grace-period batching, update overhead,
http://www.rdrop.com/users/paulmck/RCU/urcu-main-accepted.2011.08.30a.pdf
http://www.rdrop.com/users/paulmck/RCU/urcu-supp-accepted.2011.08.30a.pdf
+ http://www.computer.org/cms/Computer.org/dl/trans/td/2012/02/extras/ttd2012020375s.pdf
+}
+}
+
+@inproceedings{AustinClements2012RCULinux:mmapsem
+,author = {Austin Clements and Frans Kaashoek and Nickolai Zeldovich}
+,title = {Scalable Address Spaces Using {RCU} Balanced Trees}
+,booktitle = {Architectural Support for Programming Languages and Operating Systems (ASPLOS 2012)}
+,month = {March}
+,year = {2012}
+,pages = {199--210}
+,numpages = {12}
+,publisher = {ACM}
+,address = {London, UK}
+,url="http://people.csail.mit.edu/nickolai/papers/clements-bonsai.pdf"
+}
+
+@unpublished{PaulEMcKenney2012ELCbattery
+,Author="Paul E. McKenney"
+,Title="Making {RCU} Safe For Battery-Powered Devices"
+,month="February"
+,day="15"
+,year="2012"
+,note="Available:
+\url{http://www.rdrop.com/users/paulmck/RCU/RCUdynticks.2012.02.15b.pdf}
+[Viewed March 1, 2012]"
+,annotation={
+ RCU_FAST_NO_HZ, round 2.
+}
+}
+
+@article{GuillermoVigueras2012RCUCrowd
+,author = {Vigueras, Guillermo and Ordu\~{n}a, Juan M. and Lozano, Miguel}
+,day = {25}
+,doi = {10.1007/s11227-012-0766-x}
+,issn = {0920-8542}
+,journal = {The Journal of Supercomputing}
+,keywords = {linux, simulation}
+,month = apr
+,posted-at = {2012-05-03 09:12:04}
+,priority = {2}
+,title = {{A Read-Copy Update based parallel server for distributed crowd simulations}}
+,url = {http://dx.doi.org/10.1007/s11227-012-0766-x}
+,year = {2012}
+}
+
+
+@unpublished{JonCorbet2012ACCESS:ONCE
+,Author="Jon Corbet"
+,Title="{ACCESS\_ONCE()}"
+,month="August"
+,day="1"
+,year="2012"
+,note="\url{http://lwn.net/Articles/508991/}"
+,annotation={
+ A couple of simple specific compiler optimizations that motivate
+ ACCESS_ONCE().
+}
+}
+
+@unpublished{AlexeyGotsman2012VerifyGraceExtended
+,Author="Alexey Gotsman and Noam Rinetzky and Hongseok Yang"
+,Title="Verifying Highly Concurrent Algorithms with Grace (extended version)"
+,month="July"
+,day="10"
+,year="2012"
+,note="\url{http://software.imdea.org/~gotsman/papers/recycling-esop13-ext.pdf}"
+,annotation={
+ Separation-logic formulation of RCU uses.
+}
+}
+
+@unpublished{PaulMcKenney2012RCUUsage
+,Author="Paul E. McKenney and Silas Boyd-Wickizer and Jonathan Walpole"
+,Title="{RCU} Usage In the Linux Kernel: One Decade Later"
+,month="September"
+,day="17"
+,year="2012"
+,url={http://rdrop.com/users/paulmck/techreports/survey.2012.09.17a.pdf}
+,note="Technical report paulmck.2012.09.17"
+,annotation={
+ Overview of the first variant of no-CBs CPUs for RCU.
+}
+}
+
+@unpublished{JonCorbet2012NOCB
+,Author="Jon Corbet"
+,Title="Relocating RCU callbacks"
+,month="October"
+,day="31"
+,year="2012"
+,note="\url{http://lwn.net/Articles/522262/}"
+,annotation={
+ Overview of the first variant of no-CBs CPUs for RCU.
+}
+}
+
+@phdthesis{JustinSeyster2012PhD
+,author="Justin Seyster"
+,title="Runtime Verification of Kernel-Level Concurrency Using Compiler-Based Instrumentation"
+,school="Stony Brook University"
+,year="2012"
+,annotation={
+ Looking for data races, including those involving RCU.
+ Proposal:
+ http://www.fsl.cs.sunysb.edu/docs/jseyster-proposal/redflag.pdf
+ Dissertation:
+ http://www.fsl.cs.sunysb.edu/docs/jseyster-dissertation/redflag.pdf
+}
+}
+
+@unpublished{PaulEMcKenney2013RCUUsage
+,Author="Paul E. McKenney and Silas Boyd-Wickizer and Jonathan Walpole"
+,Title="{RCU} Usage in the {Linux} Kernel: One Decade Later"
+,month="February"
+,day="24"
+,year="2013"
+,note="\url{http://rdrop.com/users/paulmck/techreports/RCUUsage.2013.02.24a.pdf}"
+,annotation={
+ Usage of RCU within the Linux kernel.
+}
+}
+
+@inproceedings{AlexeyGotsman2013ESOPRCU
+,author = {Alexey Gotsman and Noam Rinetzky and Hongseok Yang}
+,title = {Verifying concurrent memory reclamation algorithms with grace}
+,booktitle = {ESOP'13: European Symposium on Programming}
+,year = {2013}
+,pages = {249--269}
+,publisher = {Springer}
+,address = {Rome, Italy}
+,annotation={
+ http://software.imdea.org/~gotsman/papers/recycling-esop13.pdf
+}
+}
+
+@unpublished{PaulEMcKenney2013NoTinyPreempt
+,Author="Paul E. McKenney"
+,Title="Simplifying RCU"
+,month="March"
+,day="6"
+,year="2013"
+,note="\url{http://lwn.net/Articles/541037/}"
+,annotation={
+ Getting rid of TINY_PREEMPT_RCU.
}
}
diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt
index 2e319d1b9ef..b10cfe711e6 100644
--- a/Documentation/RCU/rcubarrier.txt
+++ b/Documentation/RCU/rcubarrier.txt
@@ -70,10 +70,14 @@ in realtime kernels in order to avoid excessive scheduling latencies.
rcu_barrier()
-We instead need the rcu_barrier() primitive. This primitive is similar
-to synchronize_rcu(), but instead of waiting solely for a grace
-period to elapse, it also waits for all outstanding RCU callbacks to
-complete. Pseudo-code using rcu_barrier() is as follows:
+We instead need the rcu_barrier() primitive. Rather than waiting for
+a grace period to elapse, rcu_barrier() waits for all outstanding RCU
+callbacks to complete. Please note that rcu_barrier() does -not- imply
+synchronize_rcu(). In particular, if there are no RCU callbacks queued
+anywhere, rcu_barrier() is within its rights to return immediately,
+without waiting for a grace period to elapse.
+
+Pseudo-code using rcu_barrier() is as follows:
1. Prevent any new RCU callbacks from being posted.
2. Execute rcu_barrier().
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index fa5d8a9ae20..c8c42e64e95 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -531,9 +531,10 @@ dependency barrier to make it work correctly. Consider the following bit of
code:
q = &a;
- if (p)
+ if (p) {
+ <data dependency barrier>
q = &b;
- <data dependency barrier>
+ }
x = *q;
This will not have the desired effect because there is no actual data
@@ -542,9 +543,10 @@ attempting to predict the outcome in advance. In such a case what's actually
required is:
q = &a;
- if (p)
+ if (p) {
+ <read barrier>
q = &b;
- <read barrier>
+ }
x = *q;
diff --git a/Documentation/timers/NO_HZ.txt b/Documentation/timers/NO_HZ.txt
index 88697584242..cca122f2512 100644
--- a/Documentation/timers/NO_HZ.txt
+++ b/Documentation/timers/NO_HZ.txt
@@ -24,8 +24,8 @@ There are three main ways of managing scheduling-clock interrupts
workloads, you will normally -not- want this option.
These three cases are described in the following three sections, followed
-by a third section on RCU-specific considerations and a fourth and final
-section listing known issues.
+by a section on RCU-specific considerations, a section discussing
+testing, and a final section listing known issues.
NEVER OMIT SCHEDULING-CLOCK TICKS
@@ -121,14 +121,15 @@ boot parameter specifies the adaptive-ticks CPUs. For example,
"nohz_full=1,6-8" says that CPUs 1, 6, 7, and 8 are to be adaptive-ticks
CPUs. Note that you are prohibited from marking all of the CPUs as
adaptive-tick CPUs: At least one non-adaptive-tick CPU must remain
-online to handle timekeeping tasks in order to ensure that system calls
-like gettimeofday() returns accurate values on adaptive-tick CPUs.
-(This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no
-running user processes to observe slight drifts in clock rate.)
-Therefore, the boot CPU is prohibited from entering adaptive-ticks
-mode. Specifying a "nohz_full=" mask that includes the boot CPU will
-result in a boot-time error message, and the boot CPU will be removed
-from the mask.
+online to handle timekeeping tasks in order to ensure that system
+calls like gettimeofday() return accurate values on adaptive-tick CPUs.
+(This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no running
+user processes to observe slight drifts in clock rate.) Therefore, the
+boot CPU is prohibited from entering adaptive-ticks mode. Specifying a
+"nohz_full=" mask that includes the boot CPU will result in a boot-time
+error message, and the boot CPU will be removed from the mask. Note that
+this means that your system must have at least two CPUs in order for
+CONFIG_NO_HZ_FULL=y to do anything for you.
Alternatively, the CONFIG_NO_HZ_FULL_ALL=y Kconfig parameter specifies
that all CPUs other than the boot CPU are adaptive-ticks CPUs. This
@@ -232,6 +233,29 @@ scheduler will decide where to run them, which might or might not be
where you want them to run.
+TESTING
+
+So you enable all the OS-jitter features described in this document,
+but do not see any change in your workload's behavior. Is this because
+your workload isn't affected that much by OS jitter, or is it because
+something else is in the way? This section helps answer this question
+by providing a simple OS-jitter test suite, which is available on branch
+master of the following git archive:
+
+git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
+
+Clone this archive and follow the instructions in the README file.
+This test procedure will produce a trace that will allow you to evaluate
+whether or not you have succeeded in removing OS jitter from your system.
+If this trace shows that you have removed OS jitter as much as is
+possible, then you can conclude that your workload is not all that
+sensitive to OS jitter.
+
+Note: this test requires that your system have at least two CPUs.
+We do not currently have a good way to remove OS jitter from single-CPU
+systems.
+
+
KNOWN ISSUES
o Dyntick-idle slows transitions to and from idle slightly.
diff --git a/include/linux/debugobjects.h b/include/linux/debugobjects.h
index 0e5f5785d9f..98ffcbd4888 100644
--- a/include/linux/debugobjects.h
+++ b/include/linux/debugobjects.h
@@ -63,7 +63,7 @@ struct debug_obj_descr {
extern void debug_object_init (void *addr, struct debug_obj_descr *descr);
extern void
debug_object_init_on_stack(void *addr, struct debug_obj_descr *descr);
-extern void debug_object_activate (void *addr, struct debug_obj_descr *descr);
+extern int debug_object_activate (void *addr, struct debug_obj_descr *descr);
extern void debug_object_deactivate(void *addr, struct debug_obj_descr *descr);
extern void debug_object_destroy (void *addr, struct debug_obj_descr *descr);
extern void debug_object_free (void *addr, struct debug_obj_descr *descr);
@@ -85,8 +85,8 @@ static inline void
debug_object_init (void *addr, struct debug_obj_descr *descr) { }
static inline void
debug_object_init_on_stack(void *addr, struct debug_obj_descr *descr) { }
-static inline void
-debug_object_activate (void *addr, struct debug_obj_descr *descr) { }
+static inline int
+debug_object_activate (void *addr, struct debug_obj_descr *descr) { return 0; }
static inline void
debug_object_deactivate(void *addr, struct debug_obj_descr *descr) { }
static inline void
diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 97ba4e78a37..d235e88cfd7 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -101,13 +101,13 @@ static inline u64 get_jiffies_64(void)
#define time_after(a,b) \
(typecheck(unsigned long, a) && \
typecheck(unsigned long, b) && \
- ((long)(b) - (long)(a) < 0))
+ ((long)((b) - (a)) < 0))
#define time_before(a,b) time_after(b,a)
#define time_after_eq(a,b) \
(typecheck(unsigned long, a) && \
typecheck(unsigned long, b) && \
- ((long)(a) - (long)(b) >= 0))
+ ((long)((a) - (b)) >= 0))
#define time_before_eq(a,b) time_after_eq(b,a)
/*
@@ -130,13 +130,13 @@ static inline u64 get_jiffies_64(void)
#define time_after64(a,b) \
(typecheck(__u64, a) && \
typecheck(__u64, b) && \
- ((__s64)(b) - (__s64)(a) < 0))
+ ((__s64)((b) - (a)) < 0))
#define time_before64(a,b) time_after64(b,a)
#define time_after_eq64(a,b) \
(typecheck(__u64, a) && \
typecheck(__u64, b) && \
- ((__s64)(a) - (__s64)(b) >= 0))
+ ((__s64)((a) - (b)) >= 0))
#define time_before_eq64(a,b) time_after_eq64(b,a)
#define time_in_range64(a, b, c) \
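The jiffies.h hunks above move the cast outside the subtraction, so the difference is computed on the unsigned operands and only the result is converted to a signed type. A plausible reading is that this avoids signed-integer overflow, which is undefined behavior in C, while unsigned wraparound is well defined. The following is a minimal userspace sketch of that idea, not the kernel macros themselves: the function names time_after_ul(), time_after_eq_ul(), and check_wraparound() are hypothetical, and the typecheck() guards are omitted.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace stand-in for time_after(a, b): true if
 * timestamp a is later than timestamp b, even across counter wraparound.
 * The subtraction is done on unsigned values, where wraparound is well
 * defined, and only the difference is converted to a signed type.
 */
static inline int time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for time_after_eq(a, b). */
static inline int time_after_eq_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Exercise the comparison across a wrap of the counter. */
static inline void check_wraparound(void)
{
	unsigned long before = ULONG_MAX - 5UL;	/* just before the wrap */
	unsigned long after = before + 10UL;	/* wraps around to 4 */

	assert(after == 4UL);
	assert(time_after_ul(after, before));
	assert(!time_after_ul(before, after));
	assert(time_after_eq_ul(before, before));
}
```

Calling check_wraparound() from a test program exercises the case where the counter wraps between the two samples. The sketch assumes the two timestamps are less than half the counter range apart, and it relies on the usual two's-complement result when the unsigned difference is converted to long.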
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index f4b1001a467..4106721c4e5 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -267,8 +267,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
*/
#define list_first_or_null_rcu(ptr, type, member) \
({struct list_head *__ptr = (ptr); \
- struct list_head __rcu *__next = list_next_rcu(__ptr); \
- likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
+ struct list_head *__next = ACCESS_ONCE(__ptr->next); \
+ likely(__ptr != __next) ? \
+ list_entry_rcu(__next, type, member) : NULL; \
})
/**
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 0c38abbe6e3..f1f1bc39346 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -229,13 +229,9 @@ extern void rcu_irq_exit(void);
#ifdef CONFIG_RCU_USER_QS
extern void rcu_user_enter(void);
extern void rcu_user_exit(void);
-extern void rcu_user_enter_after_irq(void);
-extern void rcu_user_exit_after_irq(void);
#else
static inline void rcu_user_enter(void) { }
static inline void rcu_user_exit(void) { }
-static inline void rcu_user_enter_after_irq(void) { }
-static inline void rcu_user_exit_after_irq(void) { }
static inline void rcu_user_hooks_switch(struct task_struct *prev,
struct task_struct *next) { }
#endif /* CONFIG_RCU_USER_QS */
@@ -1015,4 +1011,22 @@ static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
+/* Only for use by adaptive-ticks code. */
+#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
+extern bool rcu_sys_is_idle(void);
+extern void rcu_sysidle_force_exit(void);
+#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
+
+static inline bool rcu_sys_is_idle(void)
+{
+ return false;
+}
+
+static inline void rcu_sysidle_force_exit(void)
+{
+}
+
+#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
+
+
#endif /* __LINUX_RCUPDATE_H */
diff --git a/init/Kconfig b/init/Kconfig
index 247084be059..c08a5495dbb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -470,6 +470,7 @@ config TREE_RCU
config TREE_PREEMPT_RCU
bool "Preemptible tree-based hierarchical RCU"
depends on PREEMPT
+ select IRQ_WORK
help
This option selects the RCU implementation that is
designed for very large SMP systems with hundreds or
diff --git a/kernel/rcu.h b/kernel/rcu.h
index 0a90ccc65bf..77131966c4a 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -67,12 +67,15 @@
extern struct debug_obj_descr rcuhead_debug_descr;
-static inline void debug_rcu_head_queue(struct rcu_head *head)
+static inline int debug_rcu_head_queue(struct rcu_head *head)
{
- debug_object_activate(head, &rcuhead_debug_descr);
+ int r1;
+
+ r1 = debug_object_activate(head, &rcuhead_debug_descr);
debug_object_active_state(head, &rcuhead_debug_descr,
STATE_RCU_HEAD_READY,
STATE_RCU_HEAD_QUEUED);
+ return r1;
}
static inline void debug_rcu_head_unqueue(struct rcu_head *head)
@@ -83,8 +86,9 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
debug_object_deactivate(head, &rcuhead_debug_descr);
}
#else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
-static inline void debug_rcu_head_queue(struct rcu_head *head)
+static inline int debug_rcu_head_queue(struct rcu_head *head)
{
+ return 0;
}
static inline void debug_rcu_head_unqueue(struct rcu_head *head)
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 14994d4e1a5..33eb4620aa1 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -212,43 +212,6 @@ static inline void debug_rcu_head_free(struct rcu_head *head)
}
/*
- * fixup_init is called when:
- * - an active object is initialized
- */
-static int rcuhead_fixup_init(void *addr, enum debug_obj_state state)
-{
- struct rcu_head *head = addr;
-
- switch (state) {
- case ODEBUG_STATE_ACTIVE:
- /*
- * Ensure that queued callbacks are all executed.
- * If we detect that we are nested in a RCU read-side critical
- * section, we should simply fail, otherwise we would deadlock.
- * In !PREEMPT configurations, there is no way to tell if we are
- * in a RCU read-side critical section or not, so we never
- * attempt any fixup and just print a warning.
- */
-#ifndef CONFIG_PREEMPT
- WARN_ON_ONCE(1);
- return 0;
-#endif
- if (rcu_preempt_depth() != 0 || preempt_count() != 0 ||
- irqs_disabled()) {
- WARN_ON_ONCE(1);
- return 0;
- }
- rcu_barrier();
- rcu_barrier_sched();
- rcu_barrier_bh();
- debug_object_init(head, &rcuhead_debug_descr);
- return 1;
- default:
- return 0;
- }
-}
-
-/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
@@ -268,69 +231,8 @@ static int rcuhead_fixup_activate(void *addr, enum debug_obj_state state)
debug_object_init(head, &rcuhead_debug_descr);
debug_object_activate(head, &rcuhead_debug_descr);
return 0;
-
- case ODEBUG_STATE_ACTIVE:
- /*
- * Ensure that queued callbacks are all executed.
- * If we detect that we are nested in a RCU read-side critical
- * section, we should simply fail, otherwise we would deadlock.
- * In !PREEMPT configurations, there is no way to tell if we are
- * in a RCU read-side critical section or not, so we never
- * attempt any fixup and just print a warning.
- */
-#ifndef CONFIG_PREEMPT
- WARN_ON_ONCE(1);
- return 0;
-#endif
- if (rcu_preempt_depth() != 0 || preempt_count() != 0 ||
- irqs_disabled()) {
- WARN_ON_ONCE(1);
- return 0;
- }
- rcu_barrier();
- rcu_barrier_sched();
- rcu_barrier_bh();
- debug_object_activate(head, &rcuhead_debug_descr);
- return 1;
default:
- return 0;
- }
-}
-
-/*
- * fixup_free is called when:
- * - an active object is freed
- */
-static int rcuhead_fixup_free(void *addr, enum debug_obj_state state)
-{
- struct rcu_head *head = addr;
-
- switch (state) {
- case ODEBUG_STATE_ACTIVE:
- /*
- * Ensure that queued callbacks are all executed.
- * If we detect that we are nested in a RCU read-side critical
- * section, we should simply fail, otherwise we would deadlock.
- * In !PREEMPT configurations, there is no way to tell if we are
- * in a RCU read-side critical section or not, so we never
- * attempt any fixup and just print a warning.
- */
-#ifndef CONFIG_PREEMPT
- WARN_ON_ONCE(1);
- return 0;
-#endif
- if (rcu_preempt_depth() != 0 || preempt_count() != 0 ||
- irqs_disabled()) {
- WARN_ON_ONCE(1);
- return 0;
- }
- rcu_barrier();
- rcu_barrier_sched();
- rcu_barrier_bh();
- debug_object_free(head, &rcuhead_debug_descr);
return 1;
- default:
- return 0;
}
}
@@ -369,9 +271,7 @@ EXPORT_SYMBOL_GPL(destroy_rcu_head_on_stack);
struct debug_obj_descr rcuhead_debug_descr = {
.name = "rcu_head",
- .fixup_init = rcuhead_fixup_init,
.fixup_activate = rcuhead_fixup_activate,
- .fixup_free = rcuhead_fixup_free,
};
EXPORT_SYMBOL_GPL(rcuhead_debug_descr);
#endif /* #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 338f1d1c1c6..32618b3fe4e 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -54,6 +54,7 @@
#include <linux/stop_machine.h>
#include <linux/random.h>
#include <linux/ftrace_event.h>
+#include <linux/suspend.h>
#include "rcutree.h"
#include <trace/events/rcu.h>
@@ -224,6 +225,10 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
.dynticks = ATOMIC_INIT(1),
+#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
+ .dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE,
+ .dynticks_idle = ATOMIC_INIT(1),
+#endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
};
static long blimit = 10; /* Maximum callbacks per rcu_do_batch. */
@@ -242,7 +247,10 @@ module_param(jiffies_till_next_fqs, ulong, 0644);
static void rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp,
struct rcu_data *rdp);
-static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *));
+static void force_qs_rnp(struct rcu_state *rsp,
+ int (*f)(struct rcu_data *rsp, bool *isidle,
+ unsigned long *maxj),
+ bool *isidle, unsigned long *maxj);
static void force_quiescent_state(struct rcu_state *rsp);
static int rcu_pending(int cpu);
@@ -427,6 +435,7 @@ void rcu_idle_enter(void)
local_irq_save(flags);
rcu_eqs_enter(false);
+ rcu_sysidle_enter(&__get_cpu_var(rcu_dynticks), 0);
local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -444,27 +453,6 @@ void rcu_user_enter(void)
{
rcu_eqs_enter(1);
}
-
-/**
- * rcu_user_enter_after_irq - inform RCU that we are going to resume userspace
- * after the current irq returns.
- *
- * This is similar to rcu_user_enter() but in the context of a non-nesting
- * irq. After this call, RCU enters into idle mode when the interrupt
- * returns.
- */
-void rcu_user_enter_after_irq(void)
-{
- unsigned long flags;
- struct rcu_dynticks *rdtp;
-
- local_irq_save(flags);
- rdtp = &__get_cpu_var(rcu_dynticks);
- /* Ensure this irq is interrupting a non-idle RCU state. */
- WARN_ON_ONCE(!(rdtp->dynticks_nesting & DYNTICK_TASK_MASK));
- rdtp->dynticks_nesting = 1;
- local_irq_restore(flags);
-}
#endif /* CONFIG_RCU_USER_QS */
/**
@@ -498,6 +486,7 @@ void rcu_irq_exit(void)
trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
else
rcu_eqs_enter_common(rdtp, oldval, true);
+ rcu_sysidle_enter(rdtp, 1);
local_irq_restore(flags);
}
@@ -566,6 +555,7 @@ void rcu_idle_exit(void)
local_irq_save(flags);
rcu_eqs_exit(false);
+ rcu_sysidle_exit(&__get_cpu_var(rcu_dynticks), 0);
local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_idle_exit);
@@ -581,28 +571,6 @@ void rcu_user_exit(void)
{
rcu_eqs_exit(1);
}
-
-/**
- * rcu_user_exit_after_irq - inform RCU that we won't resume to userspace
- * idle mode after the current non-nesting irq returns.
- *
- * This is similar to rcu_user_exit() but in the context of an irq.
- * This is called when the irq has interrupted a userspace RCU idle mode
- * context. When the current non-nesting interrupt returns after this call,
- * the CPU won't restore the RCU idle mode.
- */
-void rcu_user_exit_after_irq(void)
-{
- unsigned long flags;
- struct rcu_dynticks *rdtp;
-
- local_irq_save(flags);
- rdtp = &__get_cpu_var(rcu_dynticks);
- /* Ensure we are interrupting an RCU idle mode. */
- WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
- rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
- local_irq_restore(flags);
-}
#endif /* CONFIG_RCU_USER_QS */
/**
@@ -639,6 +607,7 @@ void rcu_irq_enter(void)
trace_rcu_dyntick(TPS("++="), oldval, rdtp->dynticks_nesting);
else
rcu_eqs_exit_common(rdtp, oldval, true);
+ rcu_sysidle_exit(rdtp, 1);
local_irq_restore(flags);
}
@@ -762,9 +731,11 @@ static int rcu_is_cpu_rrupt_from_idle(void)
* credit them with an implicit quiescent state. Return 1 if this CPU
* is in dynticks idle mode, which is an extended quiescent state.
*/
-static int dyntick_save_progress_counter(struct rcu_data *rdp)
+static int dyntick_save_progress_counter(struct rcu_data *rdp,
+ bool *isidle, unsigned long *maxj)
{
rdp->dynticks_snap = atomic_add_return(0, &rdp->dynticks->dynticks);
+ rcu_sysidle_check_cpu(rdp, isidle, maxj);
return (rdp->dynticks_snap & 0x1) == 0;
}
@@ -774,7 +745,8 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
* idle state since the last call to dyntick_save_progress_counter()
* for this same CPU, or by virtue of having been offline.
*/
-static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
+static int rcu_implicit_dynticks_qs(struct rcu_data *rdp,
+ bool *isidle, unsigned long *maxj)
{
unsigned int curr;
unsigned int snap;
@@ -1332,6 +1304,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
struct rcu_data *rdp;
struct rcu_node *rnp = rcu_get_root(rsp);
+ rcu_bind_gp_kthread();
raw_spin_lock_irq(&rnp->lock);
rsp->gp_flags = 0; /* Clear all flags: New grace period. */
@@ -1396,16 +1369,25 @@ static int rcu_gp_init(struct rcu_state *rsp)
int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
{
int fqs_state = fqs_state_in;
+ bool isidle = false;
+ unsigned long maxj;
struct rcu_node *rnp = rcu_get_root(rsp);
rsp->n_force_qs++;
if (fqs_state == RCU_SAVE_DYNTICK) {
/* Collect dyntick-idle snapshots. */
- force_qs_rnp(rsp, dyntick_save_progress_counter);
+ if (is_sysidle_rcu_state(rsp)) {
+ isidle = 1;
+ maxj = jiffies - ULONG_MAX / 4;
+ }
+ force_qs_rnp(rsp, dyntick_save_progress_counter,
+ &isidle, &maxj);
+ rcu_sysidle_report_gp(rsp, isidle, maxj);
fqs_state = RCU_FORCE_QS;
} else {
/* Handle dyntick-idle and offline CPUs. */
- force_qs_rnp(rsp, rcu_implicit_dynticks_qs);
+ isidle = 0;
+ force_qs_rnp(rsp, rcu_implicit_dynticks_qs, &isidle, &maxj);
}
/* Clear flag to prevent immediate re-entry. */
if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) {
@@ -1575,10 +1557,12 @@ rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp,
/*
* We can't do wakeups while holding the rnp->lock, as that
- * could cause possible deadlocks with the rq->lock. Deter
- * the wakeup to interrupt context.
+ * could cause possible deadlocks with the rq->lock. Defer
+ * the wakeup to interrupt context. And don't bother waking
+ * up the running kthread.
*/
- irq_work_queue(&rsp->wakeup_work);
+ if (current != rsp->gp_kthread)
+ irq_work_queue(&rsp->wakeup_work);
}
/*
@@ -2104,7 +2088,10 @@ void rcu_check_callbacks(int cpu, int user)
*
* The caller must have suppressed start of new grace periods.
*/
-static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *))
+static void force_qs_rnp(struct rcu_state *rsp,
+ int (*f)(struct rcu_data *rsp, bool *isidle,
+ unsigned long *maxj),
+ bool *isidle, unsigned long *maxj)
{
unsigned long bit;
int cpu;
@@ -2127,9 +2114,12 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *))
cpu = rnp->grplo;
bit = 1;
for (; cpu <= rnp->grphi; cpu++, bit <<= 1) {
- if ((rnp->qsmask & bit) != 0 &&
- f(per_cpu_ptr(rsp->rda, cpu)))
- mask |= bit;
+ if ((rnp->qsmask & bit) != 0) {
+ if ((rnp->qsmaskinit & bit) != 0)
+ *isidle = 0;
+ if (f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
+ mask |= bit;
+ }
}
if (mask != 0) {
@@ -2304,6 +2294,13 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp,
}
/*
+ * RCU callback function to leak a callback.
+ */
+static void rcu_leak_callback(struct rcu_head *rhp)
+{
+}
+
+/*
* Helper function for call_rcu() and friends. The cpu argument will
* normally be -1, indicating "currently running CPU". It may specify
* a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier()
@@ -2317,7 +2314,12 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
struct rcu_data *rdp;
WARN_ON_ONCE((unsigned long)head & 0x3); /* Misaligned rcu_head! */
- debug_rcu_head_queue(head);
+ if (debug_rcu_head_queue(head)) {
+ /* Probable double call_rcu(), so leak the callback. */
+ ACCESS_ONCE(head->func) = rcu_leak_callback;
+ WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n");
+ return;
+ }
head->func = func;
head->next = NULL;
@@ -2802,9 +2804,20 @@ static void _rcu_barrier(struct rcu_state *rsp)
* transition. The "if" expression below therefore rounds the old
* value up to the next even number and adds two before comparing.
*/
- snap_done = ACCESS_ONCE(rsp->n_barrier_done);
+ snap_done = rsp->n_barrier_done;
_rcu_barrier_trace(rsp, "Check", -1, snap_done);
- if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
+
+ /*
+ * If the value in snap is odd, we needed to wait for the current
+ * rcu_barrier() to complete, then wait for the next one, in other
+ * words, we need the value of snap_done to be three larger than
+ * the value of snap. On the other hand, if the value in snap is
+ * even, we only had to wait for the next rcu_barrier() to complete,
+ * in other words, we need the value of snap_done to be only two
+ * greater than the value of snap. The "(snap + 3) & ~0x1" computes
+ * this for us (thank you, Linus!).
+ */
+ if (ULONG_CMP_GE(snap_done, (snap + 3) & ~0x1)) {
_rcu_barrier_trace(rsp, "EarlyExit", -1, snap_done);
smp_mb(); /* caller's subsequent code after above check. */
mutex_unlock(&rsp->barrier_mutex);
@@ -2947,6 +2960,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
rdp->blimit = blimit;
init_callback_list(rdp); /* Re-enable callbacks on this CPU. */
rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
+ rcu_sysidle_init_percpu_data(rdp->dynticks);
atomic_set(&rdp->dynticks->dynticks,
(atomic_read(&rdp->dynticks->dynticks) & ~0x1) + 1);
raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
@@ -3032,6 +3046,25 @@ static int rcu_cpu_notify(struct notifier_block *self,
return NOTIFY_OK;
}
+static int rcu_pm_notify(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ switch (action) {
+ case PM_HIBERNATION_PREPARE:
+ case PM_SUSPEND_PREPARE:
+ if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */
+ rcu_expedited = 1;
+ break;
+ case PM_POST_HIBERNATION:
+ case PM_POST_SUSPEND:
+ rcu_expedited = 0;
+ break;
+ default:
+ break;
+ }
+ return NOTIFY_OK;
+}
+
/*
* Spawn the kthread that handles this RCU flavor's grace periods.
*/
@@ -3273,6 +3306,7 @@ void __init rcu_init(void)
* or the scheduler are operational.
*/
cpu_notifier(rcu_cpu_notify, 0);
+ pm_notifier(rcu_pm_notify, 0);
for_each_online_cpu(cpu)
rcu_cpu_notify(NULL, CPU_UP_PREPARE, (void *)(long)cpu);
}
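Note on the _rcu_barrier() hunk above: the new threshold "(snap + 3) & ~0x1" is meant to equal the old "((snap + 1) & ~0x1) + 2". The sketch below checks that claim in user space. The barrier_* function names are hypothetical, and ULONG_CMP_GE() is reproduced here as an assumption about the kernel's wrap-safe comparison macro.

```c
#include <assert.h>
#include <limits.h>

/* Wrap-safe "a >= b" for unsigned long counters (assumed kernel form). */
#define ULONG_CMP_GE(a, b)  (ULONG_MAX / 2 >= (a) - (b))

/* Threshold as computed before this change. */
static unsigned long barrier_target_old(unsigned long snap)
{
    return ((snap + 1) & ~0x1UL) + 2;
}

/* Threshold as computed after this change. */
static unsigned long barrier_target_new(unsigned long snap)
{
    return (snap + 3) & ~0x1UL;
}

/* Nonzero if snap_done shows a full barrier ran after snap was taken. */
static int barrier_early_exit(unsigned long snap, unsigned long snap_done)
{
    return ULONG_CMP_GE(snap_done, barrier_target_new(snap));
}

/* Check that both forms agree, including across counter wraparound. */
static void barrier_self_check(void)
{
    unsigned long i;

    for (i = 0; i < 1000; i++) {
        assert(barrier_target_old(i) == barrier_target_new(i));
        assert(barrier_target_old(ULONG_MAX - i) ==
               barrier_target_new(ULONG_MAX - i));
    }
}
```

For an even snapshot such as 4 the threshold is 6 (snap + 2), and for an odd snapshot such as 5 it is 8 (snap + 3), matching the two cases described in the new comment.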
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index cbdeac6cea9..5f97eab602c 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -88,6 +88,14 @@ struct rcu_dynticks {
/* Process level is worth LLONG_MAX/2. */
int dynticks_nmi_nesting; /* Track NMI nesting level. */
atomic_t dynticks; /* Even value for idle, else odd. */
+#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
+ long long dynticks_idle_nesting;
+ /* irq/process nesting level from idle. */
+ atomic_t dynticks_idle; /* Even value for idle, else odd. */
+ /* "Idle" excludes userspace execution. */
+ unsigned long dynticks_idle_jiffies;
+ /* End of last non-NMI non-idle period. */
+#endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
#ifdef CONFIG_RCU_FAST_NO_HZ
bool all_lazy; /* Are all CPU's CBs lazy? */
unsigned long nonlazy_posted;
@@ -545,6 +553,15 @@ static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
static void rcu_spawn_nocb_kthreads(struct rcu_state *rsp);
static void rcu_kick_nohz_cpu(int cpu);
static bool init_nocb_callback_list(struct rcu_data *rdp);
+static void rcu_sysidle_enter(struct rcu_dynticks *rdtp, int irq);
+static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq);
+static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
+ unsigned long *maxj);
+static bool is_sysidle_rcu_state(struct rcu_state *rsp);
+static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
+ unsigned long maxj);
+static void rcu_bind_gp_kthread(void);
+static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp);
#endif /* #ifndef RCU_TREE_NONCORE */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index dff86f53ee0..130c97b027f 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -28,7 +28,7 @@
#include <linux/gfp.h>
#include <linux/oom.h>
#include <linux/smpboot.h>
-#include <linux/tick.h>
+#include "time/tick-internal.h"
#define RCU_KTHREAD_PRIO 1
@@ -2373,3 +2373,425 @@ static void rcu_kick_nohz_cpu(int cpu)
smp_send_reschedule(cpu);
#endif /* #ifdef CONFIG_NO_HZ_FULL */
}
+
+
+#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
+
+/*
+ * Define RCU flavor that holds sysidle state. This needs to be the
+ * most active flavor of RCU.
+ */
+#ifdef CONFIG_PREEMPT_RCU
+static struct rcu_state *rcu_sysidle_state = &rcu_preempt_state;
+#else /* #ifdef CONFIG_PREEMPT_RCU */
+static struct rcu_state *rcu_sysidle_state = &rcu_sched_state;
+#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
+
+static int full_sysidle_state; /* Current system-idle state. */
+#define RCU_SYSIDLE_NOT 0 /* Some CPU is not idle. */
+#define RCU_SYSIDLE_SHORT 1 /* All CPUs idle for brief period. */
+#define RCU_SYSIDLE_LONG 2 /* All CPUs idle for long enough. */
+#define RCU_SYSIDLE_FULL 3 /* All CPUs idle, ready for sysidle. */
+#define RCU_SYSIDLE_FULL_NOTED 4 /* Actually entered sysidle state. */
+
+/*
+ * Invoked to note exit from irq or task transition to idle. Note that
+ * usermode execution does -not- count as idle here! After all, we want
+ * to detect full-system idle states, not RCU quiescent states and grace
+ * periods. The caller must have disabled interrupts.
+ */
+static void rcu_sysidle_enter(struct rcu_dynticks *rdtp, int irq)
+{
+ unsigned long j;
+
+ /* Adjust nesting, check for fully idle. */
+ if (irq) {
+ rdtp->dynticks_idle_nesting--;
+ WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
+ if (rdtp->dynticks_idle_nesting != 0)
+ return; /* Still not fully idle. */
+ } else {
+ if ((rdtp->dynticks_idle_nesting & DYNTICK_TASK_NEST_MASK) ==
+ DYNTICK_TASK_NEST_VALUE) {
+ rdtp->dynticks_idle_nesting = 0;
+ } else {
+ rdtp->dynticks_idle_nesting -= DYNTICK_TASK_NEST_VALUE;
+ WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
+ return; /* Still not fully idle. */
+ }
+ }
+
+ /* Record start of fully idle period. */
+ j = jiffies;
+ ACCESS_ONCE(rdtp->dynticks_idle_jiffies) = j;
+ smp_mb__before_atomic_inc();
+ atomic_inc(&rdtp->dynticks_idle);
+ smp_mb__after_atomic_inc();
+ WARN_ON_ONCE(atomic_read(&rdtp->dynticks_idle) & 0x1);
+}
+
+/*
+ * Unconditionally force exit from full system-idle state. This is
+ * invoked when a normal CPU exits idle, but must be called separately
+ * for the timekeeping CPU (tick_do_timer_cpu). The reason for this
+ * is that the timekeeping CPU is permitted to take scheduling-clock
+ * interrupts while the system is in system-idle state, and of course
+ * rcu_sysidle_exit() has no way of distinguishing a scheduling-clock
+ * interrupt from any other type of interrupt.
+ */
+void rcu_sysidle_force_exit(void)
+{
+ int oldstate = ACCESS_ONCE(full_sysidle_state);
+ int newoldstate;
+
+ /*
+ * Each pass through the following loop attempts to exit full
+ * system-idle state. If contention proves to be a problem,
+ * a trylock-based contention tree could be used here.
+ */
+ while (oldstate > RCU_SYSIDLE_SHORT) {
+ newoldstate = cmpxchg(&full_sysidle_state,
+ oldstate, RCU_SYSIDLE_NOT);
+ if (oldstate == newoldstate &&
+ oldstate == RCU_SYSIDLE_FULL_NOTED) {
+ rcu_kick_nohz_cpu(tick_do_timer_cpu);
+ return; /* We cleared it, done! */
+ }
+ oldstate = newoldstate;
+ }
+ smp_mb(); /* Order initial oldstate fetch vs. later non-idle work. */
+}
+
+/*
+ * Invoked to note entry to irq or task transition from idle. Note that
+ * usermode execution does -not- count as idle here! The caller must
+ * have disabled interrupts.
+ */
+static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
+{
+ /* Adjust nesting, check for already non-idle. */
+ if (irq) {
+ rdtp->dynticks_idle_nesting++;
+ WARN_ON_ONCE(rdtp->dynticks_idle_nesting <= 0);
+ if (rdtp->dynticks_idle_nesting != 1)
+ return; /* Already non-idle. */
+ } else {
+ /*
+ * Allow for irq misnesting. Yes, it really is possible
+ * to enter an irq handler then never leave it, and maybe
+ * also vice versa. Handle both possibilities.
+ */
+ if (rdtp->dynticks_idle_nesting & DYNTICK_TASK_NEST_MASK) {
+ rdtp->dynticks_idle_nesting += DYNTICK_TASK_NEST_VALUE;
+ WARN_ON_ONCE(rdtp->dynticks_idle_nesting <= 0);
+ return; /* Already non-idle. */
+ } else {
+ rdtp->dynticks_idle_nesting = DYNTICK_TASK_EXIT_IDLE;
+ }
+ }
+
+ /* Record end of idle period. */
+ smp_mb__before_atomic_inc();
+ atomic_inc(&rdtp->dynticks_idle);
+ smp_mb__after_atomic_inc();
+ WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks_idle) & 0x1));
+
+ /*
+ * If we are the timekeeping CPU, we are permitted to be non-idle
+ * during a system-idle state. This must be the case, because
+ * the timekeeping CPU has to take scheduling-clock interrupts
+ * during the time that the system is transitioning to full
+ * system-idle state. This means that the timekeeping CPU must
+ * invoke rcu_sysidle_force_exit() directly if it does anything
+ * more than take a scheduling-clock interrupt.
+ */
+ if (smp_processor_id() == tick_do_timer_cpu)
+ return;
+
+ /* Update system-idle state: We are clearly no longer fully idle! */
+ rcu_sysidle_force_exit();
+}
+
+/*
+ * Check to see if the current CPU is idle. Note that usermode execution
+ * does not count as idle. The caller must have disabled interrupts.
+ */
+static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
+ unsigned long *maxj)
+{
+ int cur;
+ unsigned long j;
+ struct rcu_dynticks *rdtp = rdp->dynticks;
+
+ /*
+ * If some other CPU has already reported non-idle, if this is
+ * not the flavor of RCU that tracks sysidle state, or if this
+ * is an offline CPU or the timekeeping CPU, there is nothing to do.
+ */
+ if (!*isidle || rdp->rsp != rcu_sysidle_state ||
+ cpu_is_offline(rdp->cpu) || rdp->cpu == tick_do_timer_cpu)
+ return;
+ if (rcu_gp_in_progress(rdp->rsp))
+ WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu);
+
+ /* Pick up the current idle counter and check it. */
+ cur = atomic_read(&rdtp->dynticks_idle);
+ if (cur & 0x1) {
+ *isidle = false; /* We are not idle! */
+ return;
+ }
+ smp_mb(); /* Read counters before timestamps. */
+
+ /* Pick up timestamps. */
+ j = ACCESS_ONCE(rdtp->dynticks_idle_jiffies);
+ /* If this CPU entered idle more recently, update maxj timestamp. */
+ if (ULONG_CMP_LT(*maxj, j))
+ *maxj = j;
+}
+
+/*
+ * Is this the flavor of RCU that is handling full-system idle?
+ */
+static bool is_sysidle_rcu_state(struct rcu_state *rsp)
+{
+ return rsp == rcu_sysidle_state;
+}
+
+/*
+ * Bind the grace-period kthread for the sysidle flavor of RCU to the
+ * timekeeping CPU.
+ */
+static void rcu_bind_gp_kthread(void)
+{
+ int cpu = ACCESS_ONCE(tick_do_timer_cpu);
+
+ if (cpu < 0 || cpu >= nr_cpu_ids)
+ return;
+ if (raw_smp_processor_id() != cpu)
+ set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
+/*
+ * Return a delay in jiffies based on the number of CPUs, rcu_node
+ * leaf fanout, and jiffies tick rate. The idea is to allow larger
+ * systems more time to transition to full-idle state in order to
+ * avoid the cache thrashing that would otherwise occur on the state variable.
+ * Really small systems (less than a couple of tens of CPUs) should
+ * instead use a single global atomically incremented counter, and later
+ * versions of this will automatically reconfigure themselves accordingly.
+ */
+static unsigned long rcu_sysidle_delay(void)
+{
+ if (nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL)
+ return 0;
+ return DIV_ROUND_UP(nr_cpu_ids * HZ, rcu_fanout_leaf * 1000);
+}
+
+/*
+ * Advance the full-system-idle state. This is invoked when all of
+ * the non-timekeeping CPUs are idle.
+ */
+static void rcu_sysidle(unsigned long j)
+{
+ /* Check the current state. */
+ switch (ACCESS_ONCE(full_sysidle_state)) {
+ case RCU_SYSIDLE_NOT:
+
+ /* First time all are idle, so note a short idle period. */
+ ACCESS_ONCE(full_sysidle_state) = RCU_SYSIDLE_SHORT;
+ break;
+
+ case RCU_SYSIDLE_SHORT:
+
+ /*
+ * Idle for a bit, time to advance to next state?
+ * cmpxchg failure means race with non-idle, let them win.
+ */
+ if (ULONG_CMP_GE(jiffies, j + rcu_sysidle_delay()))
+ (void)cmpxchg(&full_sysidle_state,
+ RCU_SYSIDLE_SHORT, RCU_SYSIDLE_LONG);
+ break;
+
+ case RCU_SYSIDLE_LONG:
+
+ /*
+ * Do an additional check pass before advancing to full.
+ * cmpxchg failure means race with non-idle, let them win.
+ */
+ if (ULONG_CMP_GE(jiffies, j + rcu_sysidle_delay()))
+ (void)cmpxchg(&full_sysidle_state,
+ RCU_SYSIDLE_LONG, RCU_SYSIDLE_FULL);
+ break;
+
+ default:
+ break;
+ }
+}
+
+/*
+ * Found a non-idle non-timekeeping CPU, so kick the system-idle state
+ * back to the beginning.
+ */
+static void rcu_sysidle_cancel(void)
+{
+ smp_mb();
+ ACCESS_ONCE(full_sysidle_state) = RCU_SYSIDLE_NOT;
+}
+
+/*
+ * Update the sysidle state based on the results of a force-quiescent-state
+ * scan of the CPUs' dyntick-idle state.
+ */
+static void rcu_sysidle_report(struct rcu_state *rsp, int isidle,
+ unsigned long maxj, bool gpkt)
+{
+ if (rsp != rcu_sysidle_state)
+ return; /* Wrong flavor, ignore. */
+ if (gpkt && nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL)
+ return; /* Running state machine from timekeeping CPU. */
+ if (isidle)
+ rcu_sysidle(maxj); /* More idle! */
+ else
+ rcu_sysidle_cancel(); /* Idle is over. */
+}
+
+/*
+ * Wrapper for rcu_sysidle_report() when called from the grace-period
+ * kthread's context.
+ */
+static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
+ unsigned long maxj)
+{
+ rcu_sysidle_report(rsp, isidle, maxj, true);
+}
+
+/* Callback and function for forcing an RCU grace period. */
+struct rcu_sysidle_head {
+ struct rcu_head rh;
+ int inuse;
+};
+
+static void rcu_sysidle_cb(struct rcu_head *rhp)
+{
+ struct rcu_sysidle_head *rshp;
+
+ /*
+ * The following memory barrier is needed to replace the
+ * memory barriers that would normally be in the memory
+ * allocator.
+ */
+ smp_mb(); /* grace period precedes setting inuse. */
+
+ rshp = container_of(rhp, struct rcu_sysidle_head, rh);
+ ACCESS_ONCE(rshp->inuse) = 0;
+}
+
+/*
+ * Check to see if the system is fully idle, other than the timekeeping CPU.
+ * The caller must have disabled interrupts.
+ */
+bool rcu_sys_is_idle(void)
+{
+ static struct rcu_sysidle_head rsh;
+ int rss = ACCESS_ONCE(full_sysidle_state);
+
+ if (WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu))
+ return false;
+
+ /* Handle small-system case by doing a full scan of CPUs. */
+ if (nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL) {
+ int oldrss = rss - 1;
+
+ /*
+ * One pass to advance to each state up to _FULL.
+ * Give up if any pass fails to advance the state.
+ */
+ while (rss < RCU_SYSIDLE_FULL && oldrss < rss) {
+ int cpu;
+ bool isidle = true;
+ unsigned long maxj = jiffies - ULONG_MAX / 4;
+ struct rcu_data *rdp;
+
+ /* Scan all the CPUs looking for nonidle CPUs. */
+ for_each_possible_cpu(cpu) {
+ rdp = per_cpu_ptr(rcu_sysidle_state->rda, cpu);
+ rcu_sysidle_check_cpu(rdp, &isidle, &maxj);
+ if (!isidle)
+ break;
+ }
+ rcu_sysidle_report(rcu_sysidle_state,
+ isidle, maxj, false);
+ oldrss = rss;
+ rss = ACCESS_ONCE(full_sysidle_state);
+ }
+ }
+
+ /* If this is the first observation of an idle period, record it. */
+ if (rss == RCU_SYSIDLE_FULL) {
+ rss = cmpxchg(&full_sysidle_state,
+ RCU_SYSIDLE_FULL, RCU_SYSIDLE_FULL_NOTED);
+ return rss == RCU_SYSIDLE_FULL;
+ }
+
+ smp_mb(); /* ensure rss load happens before later caller actions. */
+
+ /* If already fully idle, tell the caller (in case of races). */
+ if (rss == RCU_SYSIDLE_FULL_NOTED)
+ return true;
+
+ /*
+ * If we aren't there yet, and a grace period is not in flight,
+ * initiate a grace period. Either way, tell the caller that
+ * we are not there yet. We use an xchg() rather than an assignment
+ * to make up for the memory barriers that would otherwise be
+ * provided by the memory allocator.
+ */
+ if (nr_cpu_ids > CONFIG_NO_HZ_FULL_SYSIDLE_SMALL &&
+ !rcu_gp_in_progress(rcu_sysidle_state) &&
+ !rsh.inuse && xchg(&rsh.inuse, 1) == 0)
+ call_rcu(&rsh.rh, rcu_sysidle_cb);
+ return false;
+}
+
+/*
+ * Initialize dynticks sysidle state for CPUs coming online.
+ */
+static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
+{
+ rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
+}
+
+#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
+
+static void rcu_sysidle_enter(struct rcu_dynticks *rdtp, int irq)
+{
+}
+
+static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
+{
+}
+
+static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
+ unsigned long *maxj)
+{
+}
+
+static bool is_sysidle_rcu_state(struct rcu_state *rsp)
+{
+ return false;
+}
+
+static void rcu_bind_gp_kthread(void)
+{
+}
+
+static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
+ unsigned long maxj)
+{
+}
+
+static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
+{
+}
+
+#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
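Note on the sysidle code above: full_sysidle_state moves through NOT, SHORT, LONG, FULL and FULL_NOTED, and any waking CPU may knock it back to NOT. The following user-space sketch models those transitions with C11 atomics in place of the kernel's cmpxchg(). It is a simplification under stated assumptions: all sysidle_* names are hypothetical, the delay check is reduced to a flag, and the kick of the timekeeping CPU is reduced to a return value.

```c
#include <assert.h>
#include <stdatomic.h>

enum {
    SYSIDLE_NOT, SYSIDLE_SHORT, SYSIDLE_LONG,
    SYSIDLE_FULL, SYSIDLE_FULL_NOTED
};

static atomic_int sysidle_state = SYSIDLE_NOT;

/* Scanner side: advance one step after a scan found all CPUs idle. */
static void sysidle_advance(int delay_elapsed)
{
    int expected;

    switch (atomic_load(&sysidle_state)) {
    case SYSIDLE_NOT:
        atomic_store(&sysidle_state, SYSIDLE_SHORT);
        break;
    case SYSIDLE_SHORT:
        expected = SYSIDLE_SHORT;
        if (delay_elapsed)  /* CAS failure: a waking CPU won the race. */
            atomic_compare_exchange_strong(&sysidle_state, &expected,
                                           SYSIDLE_LONG);
        break;
    case SYSIDLE_LONG:
        expected = SYSIDLE_LONG;
        if (delay_elapsed)
            atomic_compare_exchange_strong(&sysidle_state, &expected,
                                           SYSIDLE_FULL);
        break;
    default:
        break;
    }
}

/* Timekeeping side: claim FULL once; returns 1 if the system is idle. */
static int sysidle_note(void)
{
    int expected = SYSIDLE_FULL;

    if (atomic_compare_exchange_strong(&sysidle_state, &expected,
                                       SYSIDLE_FULL_NOTED))
        return 1;
    return expected == SYSIDLE_FULL_NOTED;
}

/* Waking CPU: reset to NOT; returns 1 if the timekeeper needs a kick. */
static int sysidle_force_exit(void)
{
    int old = atomic_load(&sysidle_state);

    while (old > SYSIDLE_SHORT) {
        int seen = old;

        if (atomic_compare_exchange_strong(&sysidle_state, &seen,
                                           SYSIDLE_NOT))
            return old == SYSIDLE_FULL_NOTED;
        old = seen; /* Lost a race: retry with the observed value. */
    }
    return 0;
}

/* Walk the happy path, then wake up. */
static void sysidle_self_check(void)
{
    sysidle_advance(1);             /* NOT -> SHORT */
    sysidle_advance(0);             /* Delay not elapsed: stays SHORT. */
    assert(atomic_load(&sysidle_state) == SYSIDLE_SHORT);
    sysidle_advance(1);             /* SHORT -> LONG */
    sysidle_advance(1);             /* LONG -> FULL */
    assert(sysidle_note() == 1);    /* FULL -> FULL_NOTED */
    assert(sysidle_force_exit() == 1);
    assert(atomic_load(&sysidle_state) == SYSIDLE_NOT);
}
```

As in the patch, a state of SHORT is left alone by the force-exit path; the next scan that finds a non-idle CPU resets it instead.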
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 70f27e89012..3381f098070 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -134,6 +134,56 @@ config NO_HZ_FULL_ALL
Note the boot CPU will still be kept outside the range to
handle the timekeeping duty.
+config NO_HZ_FULL_SYSIDLE
+ bool "Detect full-system idle state for full dynticks system"
+ depends on NO_HZ_FULL
+ default n
+ help
+ At least one CPU must keep the scheduling-clock tick running for
+ timekeeping purposes whenever there is a non-idle CPU, where
+ "non-idle" also includes dynticks CPUs as long as they are
+ running non-idle tasks. Because the underlying adaptive-tick
+ support cannot distinguish between all CPUs being idle and
+ all CPUs each running a single task in dynticks mode, the
+ underlying support simply ensures that there is always a CPU
+ handling the scheduling-clock tick, whether or not all CPUs
+ are idle. This Kconfig option enables scalable detection of
+ the all-CPUs-idle state, thus allowing the scheduling-clock
+ tick to be disabled when all CPUs are idle. Note that scalable
+ detection of the all-CPUs-idle state means that larger systems
+ will be slower to declare the all-CPUs-idle state.
+
+ Say Y if you would like to help debug all-CPUs-idle detection.
+
+ Say N if you are unsure.
+
+config NO_HZ_FULL_SYSIDLE_SMALL
+ int "Number of CPUs above which large-system approach is used"
+ depends on NO_HZ_FULL_SYSIDLE
+ range 1 NR_CPUS
+ default 8
+ help
+ The full-system idle detection mechanism takes a lazy approach
+ on large systems, as is required to attain decent scalability.
+ However, on smaller systems, scalability is not anywhere near as
+ large a concern as is energy efficiency. The sysidle subsystem
+ therefore uses a fast but non-scalable algorithm for small
+ systems and a lazier but scalable algorithm for large systems.
+ This Kconfig parameter defines the number of CPUs in the largest
+ system that will be considered to be "small".
+
+ The default value will be fine in most cases. Battery-powered
+ systems that (1) enable NO_HZ_FULL_SYSIDLE, (2) have larger
+ numbers of CPUs, and (3) are suffering from battery-lifetime
+ problems due to long sysidle latencies might wish to experiment
+ with larger values for this Kconfig parameter. On the other
+ hand, they might be even better served by disabling NO_HZ_FULL
+ entirely, given that NO_HZ_FULL is intended for HPC and
+ real-time workloads that at present do not tend to be run on
+ battery-powered systems.
+
+ Take the default if you are unsure.
+
config NO_HZ
bool "Old Idle dynticks config"
depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
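Note on NO_HZ_FULL_SYSIDLE_SMALL above: the threshold decides whether rcu_sysidle_delay() in kernel/rcutree_plugin.h returns zero or a delay that grows with the CPU count. The sketch below restates that arithmetic as a stand-alone function. The name sysidle_delay() and its parameters are hypothetical; the kernel reads nr_cpu_ids, HZ, rcu_fanout_leaf and the Kconfig value from its environment instead.

```c
#include <assert.h>

/* Round-up integer division, same form as the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

/* Delay in jiffies before the sysidle state may advance a step. */
static unsigned long sysidle_delay(unsigned long ncpus, unsigned long hz,
                                   unsigned long fanout_leaf,
                                   unsigned long small_limit)
{
    if (ncpus <= small_limit)
        return 0;   /* Small system: the timekeeping CPU scans directly. */
    return DIV_ROUND_UP(ncpus * hz, fanout_leaf * 1000);
}

/* A few worked values, assuming a leaf fanout of 16. */
static void sysidle_delay_self_check(void)
{
    assert(sysidle_delay(8, 1000, 16, 8) == 0);      /* At the limit. */
    assert(sysidle_delay(9, 100, 16, 8) == 1);       /* Rounds up to 1. */
    assert(sysidle_delay(4096, 1000, 16, 8) == 256); /* 4096000 / 16000 */
}
```

With HZ=1000 and a leaf fanout of 16, the delay works out to one jiffy per 16 CPUs, which is one way to read the help text's remark that larger systems are slower to declare the all-CPUs-idle state.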
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 37061ede8b8..bf2c8b1043d 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -381,19 +381,21 @@ void debug_object_init_on_stack(void *addr, struct debug_obj_descr *descr)
* debug_object_activate - debug checks when an object is activated
* @addr: address of the object
* @descr: pointer to an object specific debug description structure
+ * Returns 0 on success, -EINVAL if the check failed.
*/
-void debug_object_activate(void *addr, struct debug_obj_descr *descr)
+int debug_object_activate(void *addr, struct debug_obj_descr *descr)
{
enum debug_obj_state state;
struct debug_bucket *db;
struct debug_obj *obj;
unsigned long flags;
+ int ret;
struct debug_obj o = { .object = addr,
.state = ODEBUG_STATE_NOTAVAILABLE,
.descr = descr };
if (!debug_objects_enabled)
- return;
+ return 0;
db = get_bucket((unsigned long) addr);
@@ -405,23 +407,26 @@ void debug_object_activate(void *addr, struct debug_obj_descr *descr)
case ODEBUG_STATE_INIT:
case ODEBUG_STATE_INACTIVE:
obj->state = ODEBUG_STATE_ACTIVE;
+ ret = 0;
break;
case ODEBUG_STATE_ACTIVE:
debug_print_object(obj, "activate");
state = obj->state;
raw_spin_unlock_irqrestore(&db->lock, flags);
- debug_object_fixup(descr->fixup_activate, addr, state);
- return;
+ ret = debug_object_fixup(descr->fixup_activate, addr, state);
+ return ret ? -EINVAL : 0;
case ODEBUG_STATE_DESTROYED:
debug_print_object(obj, "activate");
+ ret = -EINVAL;
break;
default:
+ ret = 0;
break;
}
raw_spin_unlock_irqrestore(&db->lock, flags);
- return;
+ return ret;
}
raw_spin_unlock_irqrestore(&db->lock, flags);
@@ -431,8 +436,11 @@ void debug_object_activate(void *addr, struct debug_obj_descr *descr)
* true or not.
*/
if (debug_object_fixup(descr->fixup_activate, addr,
- ODEBUG_STATE_NOTAVAILABLE))
+ ODEBUG_STATE_NOTAVAILABLE)) {
debug_print_object(&o, "activate");
+ return -EINVAL;
+ }
+ return 0;
}
/**