add high level dox
This was SVN commit r2182.
This commit is contained in:
parent
faf5a942dc
commit
d91d5a2180
@ -1,5 +1,100 @@
|
||||
// lock-free primitives and algorithms
|
||||
//
|
||||
// Copyright (c) 2005 Jan Wassenberg
|
||||
//
|
||||
// This program is free software; you can redistribute it and/or
|
||||
// modify it under the terms of the GNU General Public License as
|
||||
// published by the Free Software Foundation; either version 2 of the
|
||||
// License, or (at your option) any later version.
|
||||
//
|
||||
// This program is distributed in the hope that it will be useful, but
|
||||
// WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
// General Public License for more details.
|
||||
//
|
||||
// Contact info:
|
||||
// Jan.Wassenberg@stud.uni-karlsruhe.de
|
||||
// http://www.stud.uni-karlsruhe.de/~urkt/
|
||||
|
||||
#include "posix_types.h" // uintptr_t
|
||||
|
||||
/*
|
||||
|
||||
[KEEP IN SYNC WITH WIKI]
|
||||
|
||||
overview
|
||||
--------
|
||||
|
||||
this module provides several implicitly thread-safe data structures.
|
||||
rather than allowing only one thread to access them at a time, their
|
||||
operations are carefully implemented such that they take effect in
|
||||
one atomic step. data consistency problems are thus avoided.
|
||||
this novel approach to synchronization has several advantages:
|
||||
- deadlocks are impossible;
|
||||
- overhead due to OS kernel entry is avoided;
|
||||
- graceful scaling to multiple processors is ensured.
|
||||
|
||||
|
||||
mechanism
|
||||
---------
|
||||
|
||||
the basic primitive that makes this possible is "compare and swap",
|
||||
a CPU instruction that performs both steps atomically. it compares a
|
||||
machine word against the expected value; if equal, the new value is
|
||||
written and an indication returned. otherwise, another thread must have
|
||||
been writing to the same location; the operation is typically retried.
|
||||
|
||||
this instruction is available on all modern architectures; in some cases,
|
||||
emulation in terms of an alternate primitive (LL/SC) is necessary.
|
||||
|
||||
|
||||
memory management
|
||||
-----------------
|
||||
|
||||
one major remaining problem is how to free no longer needed nodes in the
|
||||
data structure. in general, we want to reclaim their memory for arbitrary use;
|
||||
this isn't safe as long as other threads are still accessing them.
|
||||
|
||||
the RCU algorithm recognizes that all CPUs having entered a quiescent
|
||||
state means that no threads are still referencing data.
|
||||
lacking such kernel support, we use a similar mechanism - "hazard pointers"
|
||||
are set before accessing data; only if none are pointing to a node can it
|
||||
be freed. until then, they are stored in a per-thread 'waiting list'.
|
||||
|
||||
this approach has several advantages over previous algorithms
|
||||
(typically involving reference count): the CAS primitive need only
|
||||
operate on single machine words, and space/time overhead is much reduced.
|
||||
|
||||
|
||||
usage notes
|
||||
-----------
|
||||
|
||||
useful "payload" in the data structures is allocated when inserting each
|
||||
item: additional_bytes are appended. rationale: see struct Node definition.
|
||||
|
||||
since lockless algorithms are subtle and easy to get wrong, an extensive
|
||||
self-test is included; #define PERFORM_SELF_TEST 1 to activate.
|
||||
|
||||
|
||||
terminology
|
||||
-----------
|
||||
|
||||
"atomic" means indivisible; in this case, other CPUs cannot
|
||||
interfere with such an operation.
|
||||
"race conditions" are potential data consistency
|
||||
problems resulting from lack of thread synchronization.
|
||||
"deadlock" is a state where several threads are waiting on
|
||||
one another and no progress is possible.
|
||||
"thread-safety" is understood to mean the preceding
|
||||
two problems do not occur.
|
||||
"scalability" is a measure of how efficient synchronization is;
|
||||
overhead should not increase significantly with more processors.
|
||||
"linearization point" denotes the time at which an external
|
||||
observer believes a lockless operation to have taken effect.
|
||||
|
||||
*/
|
||||
|
||||
|
||||
// atomic "compare and swap". compare the machine word at <location> against
|
||||
// <expected>; if not equal, return false; otherwise, overwrite it with
|
||||
// <new_value> and return true.
|
||||
|
Loading…
Reference in New Issue
Block a user