Lock-Free Algorithms: Atomic Integer Operations


Atomic integer operations are the building blocks of lock-free algorithms. Modern CPUs support them directly in hardware. They are slower than their non-atomic counterparts, but typically much faster than taking a lock.

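Let's use compare-and-swap as an example. It compares the value of a variable with an expected value and, if they are equal, replaces it with a new value. It behaves like the following function, except that the real operation executes as a single atomic step: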
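/* Semantics only: this plain C version is NOT atomic by itself. */
int CompareAndSwap( int *ptr, int old, int new )
{
    int r = *ptr ;      /* value currently in memory */
    if ( r == old )     /* still the value the caller expected? */
        *ptr = new ;    /* then store the new value */
    return r ;          /* the swap happened if and only if r == old */
}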
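For comparison, here is a minimal sketch of the same operation written with the standard C11 atomics instead of hand-written assembly. The helper name cas_int is hypothetical, and unlike the function above it reports success as a boolean rather than returning the old value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically replace *ptr with desired if it
 * currently holds expected. Returns true if the swap happened. */
static bool cas_int(atomic_int *ptr, int expected, int desired)
{
    /* On failure the library call writes the value it found into
     * `expected`; here that is only our local copy, so it is discarded. */
    return atomic_compare_exchange_strong(ptr, &expected, desired);
}
```

On x86 a compiler will normally lower this call to the same lock cmpxchg instruction discussed next, but that is up to the compiler and not guaranteed by the C standard.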
The x86 architecture provides the cmpxchg instruction to perform this operation. The following GCC inline assembly shows how to use it. Unlike the function above, this version returns 1 if the swap happened and 0 otherwise:
unsigned char CompareAndSwap( int *ptr, int oldvar, int newvar )
{
    unsigned char result;
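    __asm__ __volatile__ (
    "lock; cmpxchg %3, %1\n"   /* if (eax == *ptr) *ptr = newvar; else eax = *ptr */
    "sete %b0\n"               /* result = Z flag (1 if the swap happened) */
    : "=r"(result),            /* %0: output, any register */
      "+m"(*ptr),              /* %1: read-write memory operand */
      "+a"(oldvar)             /* %2: read-write, pinned to eax */
    : "r"(newvar)              /* %3: input, any register */
    : "memory", "cc"           /* clobbers: memory and condition codes */
    );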
    return result;
}
AT&T-style assembly is hard to read, so let's decode it. First, look at the cmpxchg instruction. It compares its second operand with the eax register. If they are equal, it sets the second operand to the value of the first operand; otherwise it loads the second operand into eax. The lock prefix makes this read-modify-write atomic with respect to other processors.

In our example, the cmpxchg instruction compares %1 with eax and, if they are equal, sets %1 to %3. So what is %3? Operands are numbered from zero in the order they are listed after the assembler template (i.e. the double-quoted string), so %3 is the fourth one: "r"(newvar). Here "r" means any general-purpose register, and (newvar) means initialize it with newvar. In other words, the whole %3 means "grab any register and set it to newvar before doing the cmpxchg". Note that you can put any valid C expression inside the parentheses, such as (newvar + 1).

Similarly, %1 refers to "+m"(*ptr), the memory location that ptr points to. The "m" asks for a memory operand and the "+" marks it as both read and written, which is what we want because cmpxchg modifies it in place. The "+a"(oldvar) line loads oldvar into the eax register, which cmpxchg uses implicitly as the value to compare against.
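If the positional numbers are hard to keep track of, GCC's extended asm also accepts symbolic operand names. The following is a sketch of the same function with named operands; the function name CompareAndSwapNamed and the operand names are made up for illustration, and it assumes GCC or Clang targeting x86 or x86-64.

```c
/* Assumption: GCC or Clang on x86 / x86-64. Same behavior as the
 * article's CompareAndSwap, with symbolic operand names. */
static unsigned char CompareAndSwapNamed(int *ptr, int oldvar, int newvar)
{
    unsigned char result;
    __asm__ __volatile__ (
        "lock; cmpxchg %[newvar], %[mem]\n"   /* was: %3, %1 */
        "sete %b[result]\n"                   /* was: %b0 */
        : [result] "=r"(result),
          [mem]    "+m"(*ptr),
          [oldvar] "+a"(oldvar)               /* still pinned to eax */
        : [newvar] "r"(newvar)
        : "memory", "cc"
    );
    return result;
}
```

The [oldvar] operand never appears in the template because cmpxchg reads eax implicitly; naming it only documents what the "a" constraint is for.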

Finally, cmpxchg sets the Z (zero) flag if the comparison is equal. We want to use it as our return value. To read the Z flag we use the sete instruction, which sets its operand to 1 if the Z flag is set and to 0 if it is clear. In our example, the operand of sete is %b0. Without the letter b it is just %0, i.e. "=r"(result), or "grab any register and copy its value to result afterwards". Therefore, the C variable result will be 1 if *ptr was equal to oldvar (so the swap happened) and 0 otherwise. But what about the b in %b0? It tells the compiler to emit the 1-byte name of the register (for example al instead of eax). Use %w0 if you want the 16-bit (word-size) name instead.
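To show how such a primitive becomes a building block, here is a minimal sketch of a lock-free add built from a compare-and-swap retry loop. AtomicAdd is a hypothetical helper, not part of the original article, and the non-x86 branch is an assumption that falls back to the GCC/Clang __atomic builtin.

```c
/* Returns 1 if *ptr was equal to oldvar and has been replaced by
 * newvar, 0 otherwise (same contract as the article's version). */
static unsigned char CompareAndSwap(int *ptr, int oldvar, int newvar)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char result;
    __asm__ __volatile__ (
        "lock; cmpxchg %3, %1\n"
        "sete %b0\n"
        : "=r"(result), "+m"(*ptr), "+a"(oldvar)
        : "r"(newvar)
        : "memory", "cc"
    );
    return result;
#else
    /* Assumption: other targets use the compiler builtin instead. */
    return __atomic_compare_exchange_n(ptr, &oldvar, newvar, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
#endif
}

/* Hypothetical helper: atomically add delta to *ptr without a lock
 * and return the new value. */
static int AtomicAdd(int *ptr, int delta)
{
    int seen;
    do {
        seen = *(volatile int *)ptr;   /* snapshot the current value */
        /* If another thread changed *ptr after the snapshot, the CAS
         * fails and we retry with a fresh snapshot. */
    } while (!CompareAndSwap(ptr, seen, seen + delta));
    return seen + delta;
}
```

The loop never blocks: a thread only retries when some other thread has made progress in the meantime, which is the usual shape of a lock-free update.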

Ref: http://www.nestal.net/2008_06_01_archive.html