Boost.Lockfree Official Documentation: A Translation

Boost 1.53.0 finally ships the long-awaited Boost.Lockfree module. In the spirit of learning, I have translated its documentation below. (Original: http://www.boost.org/doc/libs/1_53_0/doc/html/lockfree.html)

 

Chapter 17. Boost.Lockfree

第17章.Boost.Lockfree

Table of Contents

目錄

Introduction & Motivation

簡介及動機

Examples

例子

Rationale

解釋

Data Structures

數據結構

Memory Management

內存管理

ABA Prevention

ABA預防

Interprocess Support

進程間支持

Appendices

附錄

Future Developments

未來發展

Introduction & Motivation

簡介和動機

Introduction & Terminology

簡介及術語

The term non-blocking denotes concurrent data structures that do not use traditional synchronization primitives like guards to ensure thread-safety. Maurice Herlihy and Nir Shavit (compare "The Art of Multiprocessor Programming") distinguish between three types of non-blocking data structures, each having different properties:

術語無阻塞表示併發數據結構,它們不使用傳統同步原語例如守衛者來保證線程安全。MauriceHerlihy和NirShavit(相比多處理器編程藝術)區分了三種類型的無阻塞數據結構,每種均具有不同的特性:

  • data structures are wait-free, if every concurrent operation is guaranteed to be finished in a finite number of steps. It is therefore possible to give worst-case guarantees for the number of operations.
  • 無等待數據結構,如果所有併發操作都保證會在有限步驟內完成。因此就有可能給出一個對操作數目的最壞保證。
  • data structures are lock-free, if some concurrent operations are guaranteed to be finished in a finite number of steps. While it is in theory possible that some operations never make any progress, it is very unlikely to happen in practical applications.
  • 無鎖數據結構,如果一些併發操作保證在有限步驟內完成。雖然理論上有些操作可能不會有任何進展,但實際應用中基本不太可能發生。
  • data structures are obstruction-free, if a concurrent operation is guaranteed to be finished in a finite number of steps, unless another concurrent operation interferes.
  • 無梗阻數據結構,如果除非被另外一個併發操作干預,否則一個併發操作保證在有限步驟內完成。

Some data structures can only be implemented in a lock-free manner if they are used under certain restrictions. The relevant aspect for the implementation of boost.lockfree is the number of producer and consumer threads. Single-producer (sp) or multiple-producer (mp) means that only a single thread or multiple concurrent threads are allowed to add data to a data structure. Single-consumer (sc) or multiple-consumer (mc) denotes the equivalent for the removal of data from the data structure.

如果使用在一定的限制條件下,一些數據結構只能被無鎖的方式實現。與boost.lockfree實現相對應的是生產者線程和消費者線程的數目。單生產者(sp)多生產者(mp)意味着只有一個線程或多個併發線程被允許添加數據至某數據結構中。單消費者(sc)多消費者(mc)則對應於從數據結構中移除數據。

Properties of Non-Blocking Data Structures

無阻塞數據結構的性質

Non-blocking data structures do not rely on locks and mutexes to ensure thread-safety. The synchronization is done completely in user-space without any direct interaction with the operating system [4]. This implies that they are not prone to issues like priority inversion (a high-priority thread has to wait for a low-priority thread that holds a lock).

無阻塞數據結構不依賴於鎖和互斥量來保證線程安全。同步完全在用戶空間中完成,而不需要與操作系統的任何直接交互。這意味着它們不容易出現例如優先級反轉(高優先級線程需要等待持有鎖的低優先級線程)等問題。

Instead of relying on guards, non-blocking data structures require atomic operations (specific CPU instructions executed without interruption). This means that any thread sees the state either before or after the operation, but no intermediate state can be observed. Not all hardware supports the same set of atomic instructions. If an instruction is not available in hardware, it can be emulated in software using guards. However, this has the obvious drawback of losing the lock-free property.

無阻塞數據結構需要原子操作(特定的CPU執行指令不中斷)而不是依賴於“衛兵”。這意味着任何線程只能看到操作之前或之後的狀態,而不能觀察到任何中間狀態。不是所有的硬件都支持同樣的原子指令集。如果硬件上不支持,則可以在軟件上使用“衛兵”來模擬。但這因爲失去了無鎖的性質而具有明顯的缺陷。
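
To make the idea of an atomic update concrete, here is a minimal sketch of a compare-and-exchange retry loop written against the standard `std::atomic` (not Boost.Lockfree itself). The function name `atomic_fetch_max` is a hypothetical helper introduced only for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically replaces `target` with
// max(target, candidate) and returns the value observed before the update.
inline int atomic_fetch_max(std::atomic<int>& target, int candidate)
{
    int observed = target.load();
    // On failure compare_exchange_weak stores the current value into
    // `observed`, so each retry works against the latest state. Other
    // threads only ever see the old value or the new one.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate))
        ;
    return observed;
}
```

A loop like this is lock-free rather than wait-free: a single thread may retry repeatedly, but every failed attempt means some other thread succeeded.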

Performance of Non-Blocking Data Structures

無鎖數據結構的性能

When discussing the performance of non-blocking data structures, one has to distinguish between amortized and worst-case costs. The definitions of 'lock-free' and 'wait-free' only mention the upper bound of an operation. Therefore lock-free data structures are not necessarily the best choice for every use case. In order to maximise the throughput of an application, one should consider high-performance concurrent data structures [5].

在討論無鎖數據結構性能時,首先應該區分“平均”“最壞情況”開銷。“無鎖”和“無等待”的定義僅提及了一個操作的上限。因此無鎖數據結構並不總是在任何情況下都是最好的選擇。爲了使應用程序的吞吐量最大化,你應該考慮高性能併發數據結構。

Lock-free data structures are the better choice for optimizing the latency of a system or for avoiding priority inversion, which may be necessary in real-time applications. In general, we advise considering whether lock-free data structures are necessary or whether concurrent data structures are sufficient. In any case, we advise benchmarking different data structures for the specific workload.

在優化系統延時或避免優先反轉方面(在實時應用中可能需要這樣),無鎖數據結構將會是一個更好的選擇。一般來說我們建議考慮是否需要無鎖數據結構或者是否併發數據結構就夠了?在任何情況下,我們都建議對一個特定的工作量利用不同的數據結構來執行基準測試。

Sources of Blocking Behavior

阻塞行爲的來源

Apart from locks and mutexes (which we are not using in boost.lockfree anyway), there are three other aspects that could violate lock-freedom:

除了鎖和互斥量(我們不會在boost.lockfree中使用它們),還有其他三處可能會違反鎖定自由的地方:

Atomic Operations

原子操作

Some architectures do not provide the necessary atomic operations natively in hardware. In that case, they are emulated in software using spinlocks, which are themselves blocking.

有些系統架構在硬件層面不提供必需的原生原子操作。如果是這種情況,將會在軟件層面使用自旋鎖來模擬,這本身是阻塞的。

Memory Allocations

內存分配

Allocating memory from the operating system is not lock-free. This makes it impossible to implement truly dynamically-sized non-blocking data structures. The node-based data structures of boost.lockfree use a memory pool to allocate the internal nodes. If this memory pool is exhausted, memory for new nodes has to be allocated from the operating system. However, all data structures of boost.lockfree can be configured to avoid memory allocations (the specific calls will fail instead). This is especially useful for real-time systems that require lock-free memory allocations.

從操作系統分配內存不是無鎖的。這使得不可能實現真正動態大小的無阻塞數據結構。boost.lockfree中基於節點的數據結構使用內存池來分配內部節點。如果內存池被耗盡,新節點的內存就需要從操作系統中分配。但是所有boost.lockfree中的數據結構都能配置爲避免內存分配(相對應的,某些調用將失敗)。這對那些需要無鎖內存分配的實時系統特別有用。

Exception Handling

異常處理

C++ exception handling does not give any guarantees about its real-time behavior. We therefore do not encourage the use of exceptions and exception handling in lock-free code.

C++異常處理對其實時性不做任何保證。因此我們不鼓勵在無鎖代碼中使用異常和異常處理。

Data Structures

數據結構

boost.lockfree implements three lock-free data structures:

boost.lockfree實現了三種無鎖數據結構:

boost::lockfree::queue

a lock-free multi-producer/multi-consumer queue

一個無鎖的多生產者/多消費者隊列

boost::lockfree::stack

a lock-free multi-producer/multi-consumer stack

一個無鎖的多生產者/多消費者棧

boost::lockfree::spsc_queue

a wait-free single-producer/single-consumer queue (commonly known as a ring buffer)

一個無等待的單生產者/單消費者隊列(通常被稱爲環形緩衝區)
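
As a rough illustration of why a single-producer/single-consumer ring buffer can be wait-free, the sketch below uses only `std::atomic`. It is not the Boost implementation; the class name `SpscRing` and its layout are assumptions made for this example. Each side writes only its own index and never retries, so every call finishes in a bounded number of steps.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical minimal ring buffer (requires N >= 2). One slot always stays
// empty so that "full" and "empty" can be told apart.
template <typename T, std::size_t N>
class SpscRing
{
public:
    SpscRing() : head_(0), tail_(0) {}

    // Call from the single producer thread only.
    bool push(const T& value)
    {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire))
            return false;                     // buffer is full
        buffer_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the element
        return true;
    }

    // Call from the single consumer thread only.
    bool pop(T& out)
    {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire))
            return false;                     // buffer is empty
        out = buffer_[head];
        head_.store((head + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }

private:
    T buffer_[N];
    std::atomic<std::size_t> head_;  // next slot to read (written by consumer)
    std::atomic<std::size_t> tail_;  // next slot to write (written by producer)
};
```

With `N` slots the sketch holds at most `N - 1` elements; `push` and `pop` report failure instead of blocking, which mirrors the boolean return values used in the examples of this document.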

Data Structure Configuration

數據結構配置

The data structures can be configured with Boost.Parameter-style templates:

數據結構能使用Boost.Parameter類型模板進行配置:

boost::lockfree::fixed_sized

Configures the data structure as fixed-sized. The internal nodes are stored inside an array and they are addressed by array indexing. This limits the possible size of the queue to the number of elements that can be addressed by the index type (usually 2**16-2), but on platforms that lack double-width compare-and-exchange instructions, this is the best way to achieve lock-freedom.

配置數據結構爲固定大小。內部節點被存儲在一個數組內,並使用數組索引定位。這將隊列可能的大小限制在了能被索引類型映射的元素總數(通常是2**16-2),但是在缺少雙寬的比較交換(compare-and-exchange,注:一般是記爲compare-and-swap,CAS)指令的平臺上,這是實現無鎖的最好方式。

boost::lockfree::capacity

Sets the capacity of a data structure at compile-time. This implies that a data structure is fixed-sized.

在編譯時設置數據結構的容量。這意味着數據結構是固定大小的。

boost::lockfree::allocator

Defines the allocator. boost.lockfree supports stateful allocators and is compatible with Boost.Interprocess allocators.

定義分配器。boost.lockfree支持具狀態分配器,並且與Boost.Interprocess的分配器兼容。

 

Examples

示例

Queue

隊列

The boost::lockfree::queue class implements a multi-writer/multi-reader queue. The following example shows how integer values are produced and consumed by 4 threads each:

類 boost::lockfree::queue 實現了一個多寫入/多讀取隊列。下面的例子展示瞭如何產生整數,並被4個線程分別消費:

#include <boost/thread/thread.hpp>
#include <boost/lockfree/queue.hpp>
#include <iostream>

#include <boost/atomic.hpp>

boost::atomic_int producer_count(0);
boost::atomic_int consumer_count(0);

boost::lockfree::queue<int> queue(128);

const int iterations = 10000000;
const int producer_thread_count = 4;
const int consumer_thread_count = 4;

void producer(void)
{
    for (int i = 0; i != iterations; ++i) {
        int value = ++producer_count;
        while (!queue.push(value))
            ;
    }
}

boost::atomic<bool> done (false);
void consumer(void)
{
    int value;
    while (!done) {
        while (queue.pop(value))
            ++consumer_count;
    }

    while (queue.pop(value))
        ++consumer_count;
}

int main(int argc, char* argv[])
{
    using namespace std;
    cout << "boost::lockfree::queue is ";
    if (!queue.is_lock_free())
        cout << "not ";
    cout << "lockfree" << endl;

    boost::thread_group producer_threads, consumer_threads;

    for (int i = 0; i != producer_thread_count; ++i)
        producer_threads.create_thread(producer);

    for (int i = 0; i != consumer_thread_count; ++i)
        consumer_threads.create_thread(consumer);

    producer_threads.join_all();
    done = true;

    consumer_threads.join_all();

    cout << "produced " << producer_count << " objects." << endl;
    cout << "consumed " << consumer_count << " objects." << endl;
}

The program output is:

程序輸出:

produced 40000000 objects.
consumed 40000000 objects.

Stack

The boost::lockfree::stack class implements a multi-writer/multi-reader stack. The following example shows how integer values are produced and consumed by 4 threads each:

boost::lockfree::stack實現了一個多寫入/多讀取棧。下面的例子展示瞭如何產生整數,並被4個線程分別消費:

#include <boost/thread/thread.hpp>
#include <boost/lockfree/stack.hpp>
#include <iostream>

#include <boost/atomic.hpp>

boost::atomic_int producer_count(0);
boost::atomic_int consumer_count(0);

boost::lockfree::stack<int> stack(128);

const int iterations = 1000000;
const int producer_thread_count = 4;
const int consumer_thread_count = 4;

void producer(void)
{
    for (int i = 0; i != iterations; ++i) {
        int value = ++producer_count;
        while (!stack.push(value))
            ;
    }
}

boost::atomic<bool> done (false);

void consumer(void)
{
    int value;
    while (!done) {
        while (stack.pop(value))
            ++consumer_count;
    }

    while (stack.pop(value))
        ++consumer_count;
}

int main(int argc, char* argv[])
{
    using namespace std;
    cout << "boost::lockfree::stack is ";
    if (!stack.is_lock_free())
        cout << "not ";
    cout << "lockfree" << endl;

    boost::thread_group producer_threads, consumer_threads;

    for (int i = 0; i != producer_thread_count; ++i)
        producer_threads.create_thread(producer);

    for (int i = 0; i != consumer_thread_count; ++i)
        consumer_threads.create_thread(consumer);

    producer_threads.join_all();
    done = true;

    consumer_threads.join_all();

    cout << "produced " << producer_count << " objects." << endl;
    cout << "consumed " << consumer_count << " objects." << endl;
}

The program output is:

程序輸出:

produced 4000000 objects.
consumed 4000000 objects.

Waitfree Single-Producer/Single-Consumer Queue

無等待單生產者/單消費者隊列

The boost::lockfree::spsc_queue class implements a wait-free single-producer/single-consumer queue. The following example shows how integer values are produced and consumed by 2 separate threads:

boost::lockfree::spsc_queue實現了一個無等待的單生產者/單消費者隊列。下面的例子展示瞭如何產生整數,並被2個單獨的線程消費:
#include <boost/thread/thread.hpp>
#include <boost/lockfree/spsc_queue.hpp>
#include <iostream>

#include <boost/atomic.hpp>

int producer_count = 0;
boost::atomic_int consumer_count (0);

boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024> > spsc_queue;

const int iterations = 10000000;

void producer(void)
{
    for (int i = 0; i != iterations; ++i) {
        int value = ++producer_count;
        while (!spsc_queue.push(value))
            ;
    }
}

boost::atomic<bool> done (false);

void consumer(void)
{
    int value;
    while (!done) {
        while (spsc_queue.pop(value))
            ++consumer_count;
    }

    while (spsc_queue.pop(value))
        ++consumer_count;
}

int main(int argc, char* argv[])
{
    using namespace std;
    cout << "boost::lockfree::spsc_queue is ";
    if (!spsc_queue.is_lock_free())
        cout << "not ";
    cout << "lockfree" << endl;

    boost::thread producer_thread(producer);
    boost::thread consumer_thread(consumer);

    producer_thread.join();
    done = true;
    consumer_thread.join();

    cout << "produced " << producer_count << " objects." << endl;
    cout << "consumed " << consumer_count << " objects." << endl;
}

The program output is:

程序輸出:

produced 10000000 objects.
consumed 10000000 objects.

Rationale

解釋

Data Structures

數據結構

Memory Management

內存分配

ABA Prevention

ABA阻止

Interprocess Support

進程間支持

Data Structures

數據結構

The implementations follow well-known data structures. The queue is based on "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms" by Michael Scott and Maged Michael, the stack is based on "Systems programming: coping with parallelism" by R. K. Treiber, and the spsc_queue is considered 'folklore' and is implemented in several open-source projects, including the Linux kernel. All data structures are discussed in detail in "The Art of Multiprocessor Programming" by Herlihy & Shavit.

該實現是著名的數據結構的實現。隊列是基於MichaelScott和MagedMichael提出的簡單、快速且實用的無阻塞和阻塞併發隊列算法,棧是基於R. K.Treiber的《系統編程:併發處理》,特殊隊列(spsc_queue)被認爲是“民間傳說(folklore)”,它是被幾個開源項目實現的,其中包括linux內核。所有的數據結構都在Herlihy& Shavit的《多處理器編程藝術》中被詳細討論。

Memory Management

內存管理

The lock-free boost::lockfree::queue and boost::lockfree::stack classes are node-based data structures, based on a linked list. Memory management of lock-free data structures is a non-trivial problem, because we must prevent one thread from freeing an internal node while another thread still uses it. boost.lockfree uses a simple approach: it does not return any memory to the operating system. Instead, the data structures maintain a free-list so that nodes can be reused later. This is done for two reasons: first, depending on the implementation of the memory allocator, freeing the memory may block (so the implementation would not be lock-free anymore), and second, most memory reclamation algorithms are patented.

無鎖的boost::lockfree::queueboost::lockfree::stack是基於節點的數據結構,它們基於一個鏈表。無鎖數據結構的內存管理是一個不平凡的問題,因爲我們需要避免一個線程釋放了一個內部節點,但另一個線程仍然在使用它的情況。Boost.lockfree使用了一個簡單的方法不歸還任何內存至操作系統。相反,它們維護了一個空鏈表以便之後再使用它們。這樣做是出於兩個原因:首先,依賴於內存分配器的實現釋放內存,可能會阻塞(因此該實現將不再無鎖),其次,大多數內存回收算法都是具有專利的。

ABA Prevention

ABA預防

The ABA problem is a common problem when implementing lock-free data structures. It occurs when updating an atomic variable using a compare_exchange operation: thread 1 reads the value A, computes a new value, say C, and uses compare_exchange to write C only if the current value is still A. This is a problem if, in the meantime, thread 2 changes the value from A to B and back to A, because thread 1 does not observe the change of state. The common way to avoid the ABA problem is to associate a version counter with the value and change both atomically.

ABA問題是實現無鎖數據結構的一個常見問題。當使用比較交換運算更新一個原子變量時,問題就會出現:如果值A被讀取,線程1試圖將它改爲C並嘗試更新該變量,它使用比較交換來寫C,僅噹噹前值爲A時。如果同時線程2將值從A變爲B再變爲A,這將是個問題,因爲線程1沒有觀察到狀態的改變(具體可參考:http://hustpawpaw.blog.163.com/blog/static/184228324201210811243127/)。通常避免ABA問題的方法是關聯一個版本計數器至該值,並且一起原子的變化。

boost.lockfree uses a tagged_ptr helper class which associates a pointer with an integer tag. This usually requires a double-width compare_exchange, which is not available on all platforms. IA32 did not provide the cmpxchg8b opcode before the Pentium processor, and it is also lacking on many RISC architectures like PPC. Early x86-64 processors also did not provide a cmpxchg16b instruction. On 64-bit platforms one can work around this issue, because often not the full 64-bit address space is used. On x86-64, for example, only 48 bits are used for the address, so we can use the remaining 16 bits for the ABA prevention tag. For details, please consult the implementation of the boost::lockfree::detail::tagged_ptr class.

boost.lockfree使用了一個tagged_ptr助手類,它使用一整數標籤關聯了一個指針。這通常需要一個雙寬的比較交換,該操作並非在所有的平臺上都可用。IA32在奔騰處理器之前不提供cmpxchg8b操作碼,並且它也缺少許多RISC架構例如PPC。早期的X86-64處理器也不提供cmpxchg16b 指令。在64位平臺上可以解決這個問題,因爲經常並非完整的64位地址空間都被使用。例如在X86-64平臺上,僅僅使用了地址空間的48位,因此我們可以使用剩下的16位來做爲ABA預防標籤。具體細節請參考類boost::lockfree::detail::tagged_ptr 的實現。
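
The 48-bit pointer plus 16-bit tag layout described above can be sketched with plain integer arithmetic. This is only an illustration, not the code of boost::lockfree::detail::tagged_ptr: the helper names `pack_tagged`, `pointer_of` and `tag_from` are hypothetical, and the sketch assumes a 64-bit platform whose user-space pointers fit in the low 48 bits.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: pointers occupy only the low 48 bits of a 64-bit word.
const int kTagShift = 48;
const std::uint64_t kPointerMask = (std::uint64_t(1) << kTagShift) - 1;

// Stores the pointer in the low 48 bits and the tag in the high 16 bits.
inline std::uint64_t pack_tagged(const void* p, std::uint16_t tag)
{
    const std::uint64_t bits = reinterpret_cast<std::uintptr_t>(p);
    return (bits & kPointerMask) | (std::uint64_t(tag) << kTagShift);
}

// Recovers the pointer by masking off the tag bits.
inline void* pointer_of(std::uint64_t packed)
{
    return reinterpret_cast<void*>(
        static_cast<std::uintptr_t>(packed & kPointerMask));
}

// Recovers the 16-bit tag from the high bits.
inline std::uint16_t tag_from(std::uint64_t packed)
{
    return static_cast<std::uint16_t>(packed >> kTagShift);
}
```

Because the packed word is a single 64-bit value, it can be updated with an ordinary single-width compare_exchange; incrementing the tag on every update is what lets a stale comparison fail.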

For lock-free operations on 32-bit platforms without double-width compare_exchange, we support a third approach: by using a fixed-sized array to store the internal nodes, we can avoid the use of 32-bit pointers, because 16-bit indices into the array are sufficient. However, this is only possible for fixed-sized data structures, which have an upper bound on the number of internal nodes.

對不具有雙寬比較交換的32位平臺上的無鎖操作,我們支持第三種方法:我們可以通過使用固定大小的數組來存儲內部節點,從而避免使用32位指針,因此使用16位索引至數組就足夠了。然而這僅對固定大小的數據結構可行,它們有一個內部節點的上限。
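
The following sketch combines the ideas from this section: nodes stored in a fixed array, a free-list for reuse, and a version tag packed next to the index. It is a simplified stand-in, not Boost's implementation. The class name `IndexedStack` is hypothetical, and for readability it packs a 32-bit index and a 32-bit tag into one 64-bit word rather than the 16-bit indices mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical fixed-capacity stack of ints. Nodes are linked by array
// index; each list head packs {tag, index} into a single 64-bit word.
template <std::uint32_t Capacity>
class IndexedStack
{
    static const std::uint32_t kNull = 0xFFFFFFFFu;  // "no node" marker

    struct Node { int value; std::atomic<std::uint32_t> next; };

    static std::uint64_t pack(std::uint32_t tag, std::uint32_t index)
    { return (std::uint64_t(tag) << 32) | index; }
    static std::uint32_t index_of(std::uint64_t h) { return std::uint32_t(h); }
    static std::uint32_t tag_of(std::uint64_t h) { return std::uint32_t(h >> 32); }

    // Unlinks and returns the first node of `head`, or kNull if it is empty.
    std::uint32_t take(std::atomic<std::uint64_t>& head)
    {
        std::uint64_t old_head = head.load();
        for (;;) {
            const std::uint32_t idx = index_of(old_head);
            if (idx == kNull) return kNull;
            const std::uint32_t next = nodes_[idx].next.load();
            // The tag changes on every update, so a head that went
            // A -> B -> A no longer compares equal and the stale `next`
            // is never installed.
            if (head.compare_exchange_weak(old_head,
                                           pack(tag_of(old_head) + 1, next)))
                return idx;
        }
    }

    // Links node `idx` in front of `head`.
    void put(std::atomic<std::uint64_t>& head, std::uint32_t idx)
    {
        std::uint64_t old_head = head.load();
        do {
            nodes_[idx].next.store(index_of(old_head));
        } while (!head.compare_exchange_weak(old_head,
                                             pack(tag_of(old_head) + 1, idx)));
    }

public:
    IndexedStack() : head_(pack(0, kNull)), free_(pack(0, kNull))
    {
        for (std::uint32_t i = 0; i != Capacity; ++i) put(free_, i);
    }

    bool push(int value)
    {
        const std::uint32_t idx = take(free_);
        if (idx == kNull) return false;  // pool exhausted: fail, never allocate
        nodes_[idx].value = value;
        put(head_, idx);
        return true;
    }

    bool pop(int& out)
    {
        const std::uint32_t idx = take(head_);
        if (idx == kNull) return false;  // stack is empty
        out = nodes_[idx].value;
        put(free_, idx);                 // recycle the node via the free-list
        return true;
    }

private:
    Node nodes_[Capacity];
    std::atomic<std::uint64_t> head_;  // top of the stack
    std::atomic<std::uint64_t> free_;  // free-list of unused nodes
};
```

Nodes are never returned to the operating system, so a thread that still holds a stale index only ever reads valid memory, and the tag makes its compare_exchange fail. The tag can in principle wrap around; a wider tag only makes that less likely.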

Interprocess Support

進程間支持

The boost.lockfree data structures have basic support for Boost.Interprocess. The only problem is the blocking emulation of lock-free atomics, which in the current implementation is not guaranteed to be interprocess-safe.

boost.lockfree數據結構具有對Boost.Interprocess的基本支持。唯一的問題在於對無鎖原子的阻塞模擬,這在當前實現中是不保證進程安全的。

 

Future Developments

未來發展

  • More data structures (set, hash table, deque)
  • 更多的數據結構(集合,哈希表,雙端隊列)
  • Backoff schemes (exponential backoff or elimination)
  • 退避計劃(指數退避或消除)