Lock-Free Algorithms: Atomic Integer Operations

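Atomic integer operations are the building blocks of lock-free algorithms. Modern CPUs support them in hardware. They are slower than their non-atomic counterparts, but usually much faster than taking a lock.

Let's use compare-and-swap as an example. It compares the value at a memory location with an expected value and, if they are equal, stores a new value there. Either way, it returns the value it found. Conceptually it behaves like the following function, except that the real operation happens as a single atomic step: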

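/* Conceptual model only: plain C like this is NOT atomic. */
int CompareAndSwap( int *ptr, int old, int new )
{
    int r = *ptr ;        /* value found at *ptr */
    if ( r == old )
        *ptr = new ;      /* store only when it matched */
    return r ;
}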
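The function above is only a model: compiled as plain C it is not atomic. As a minimal sketch, assuming GCC or Clang, the builtin `__sync_val_compare_and_swap` has the same value-returning shape and is atomic. The wrapper name `cas_val` and the helper `add_with_cas` are hypothetical names introduced here for illustration. The helper shows the usual pattern of retrying the compare-and-swap until no other thread has changed the value in between.

```c
/* Hypothetical wrapper around the GCC/Clang builtin.
   Atomically: if *ptr == expected, store desired.
   Returns the value that was found in *ptr. */
static int cas_val(int *ptr, int expected, int desired)
{
    return __sync_val_compare_and_swap(ptr, expected, desired);
}

/* Hypothetical helper: lock-free add built from a CAS retry loop.
   Returns the value that this call installed. */
static int add_with_cas(int *ptr, int delta)
{
    int seen = __atomic_load_n(ptr, __ATOMIC_RELAXED);
    for (;;) {
        int prev = cas_val(ptr, seen, seen + delta);
        if (prev == seen)
            return seen + delta;   /* our update was installed */
        seen = prev;               /* the value changed under us: retry */
    }
}
```

With `int x = 5;`, calling `cas_val(&x, 5, 9)` returns 5 and leaves x at 9, while a second `cas_val(&x, 5, 7)` returns 9 and changes nothing, because the expected value no longer matches.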

The x86 architecture provides the cmpxchg instruction to perform this operation. The following GCC inline assembly shows how to use it. Unlike the model above, this version returns 1 if the swap happened and 0 otherwise:

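unsigned char CompareAndSwap( int *ptr, int oldvar, int newvar )
{
    unsigned char result;
    __asm__ __volatile__ (
    "lock; cmpxchg %3, %1\n"   /* if (eax == *ptr) *ptr = newvar */
    "sete %b0\n"               /* result = Z flag */
    : "=r"(result),            /* %0: output, any register */
      "+m"(*ptr),              /* %1: read-write memory operand */
      "+a"(oldvar)             /* %2: read-write, pinned to eax */
    : "r"(newvar)              /* %3: input, any register */
    : "memory", "cc"           /* clobbers: memory and flags */
    );
    return result;             /* 1 if the swap happened, else 0 */
}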

AT&T-style assembly is hard to read, so let's decode it. First, let's look at the cmpxchg instruction. In AT&T syntax, where the destination comes last, it compares its second operand against the eax register. If they are equal, it sets the second operand to the value of the first operand; otherwise it loads the second operand into eax.
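As a mental model, the effect of the instruction can be written in plain C. This is a non-atomic sketch for reading purposes only. The names `cmpxchg_model` and `acc` (standing in for eax) are made up for illustration, and the return value stands in for the Z flag.

```c
/* Non-atomic model of "cmpxchg src, dest" (AT&T operand order).
   acc plays the role of the eax register.
   Returns 1 where the real instruction would set the Z flag. */
static int cmpxchg_model(int *dest, int *acc, int src)
{
    if (*dest == *acc) {
        *dest = src;      /* match: store the new value */
        return 1;         /* Z flag set */
    }
    *acc = *dest;         /* mismatch: eax receives the current value */
    return 0;             /* Z flag clear */
}
```

The mismatch branch is what makes retry loops cheap: after a failed attempt, eax already holds the current value for the next try.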

In our example, the cmpxchg instruction compares %1 with eax, and if they are equal, it sets %1 to %3. So what is %3? Operands are numbered from zero in the order they appear after the assembler template (i.e. the double-quoted string), so %3 is the fourth one: "r"(newvar). "r" means any general-purpose register, and (newvar) means initialize it with newvar. In other words, the whole %3 means "grab any register and load newvar into it before doing the cmpxchg". Note that you can put any valid C expression inside the parentheses, such as (newvar + 1).

Similarly, %1 refers to "+m"(*ptr), the memory location that ptr points to. Good: we want cmpxchg to modify it directly. The "+" marks an operand as both read and written. The "+a"(oldvar) line tells the compiler to load oldvar into the eax register, which cmpxchg uses implicitly.

Finally, cmpxchg sets the Z flag if the comparison is equal. We want to use it as our return value. To read the Z flag we use the sete instruction, which sets its operand to 1 if the Z flag is set and to 0 if it is clear. In our example, the operand of sete is %b0. Without the letter b, it is just %0, i.e. "=r"(result), or "grab any register and copy its value to result afterwards". Therefore, the C variable result will be 1 if *ptr was equal to oldvar (so the swap happened) and 0 otherwise. And what about the b in %b0? It is an operand modifier that tells the compiler to emit the 1-byte name of the register (for example al instead of eax), which sete requires. Use %w0 if you want the 16-bit word-sized name instead.
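Positional numbers such as %1 and %3 are easy to mix up. GCC and Clang also accept symbolic operand names, written %[name] in the template and [name] in front of the constraint. The following is a sketch of the same operation in that style, not a replacement for the code above. The function name `cas_named` and the operand names are hypothetical, the "q" constraint (a register that has a byte-sized name) is swapped in for "r", and the non-x86 branch falls back to the `__sync_bool_compare_and_swap` builtin so the sketch still compiles elsewhere.

```c
/* Hypothetical variant of CompareAndSwap using named asm operands.
   Returns 1 if *ptr equaled oldval and was replaced by newval, else 0. */
static unsigned char cas_named(int *ptr, int oldval, int newval)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char ok;
    __asm__ __volatile__ (
        "lock; cmpxchgl %[newval], %[mem]\n\t"
        "sete %[ok]"                    /* ok = Z flag */
        : [ok]  "=q"(ok),               /* byte-addressable register */
          [mem] "+m"(*ptr),             /* memory operand, read and written */
          [old] "+a"(oldval)            /* pinned to eax, used implicitly */
        : [newval] "r"(newval)          /* any general-purpose register */
        : "memory", "cc"
    );
    return ok;
#else
    /* Assumption: on other targets, let the compiler builtin do the work. */
    return (unsigned char)__sync_bool_compare_and_swap(ptr, oldval, newval);
#endif
}
```

Because `ok` is declared as unsigned char, the compiler already prints the 1-byte register name for %[ok], so no b modifier is needed in this version.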

Ref: http://www.nestal.net/2008_06_01_archive.html
