Atomic Operations: Lock-Free Synchronization

The <atomic> header provides atomic operations—indivisible operations that appear to occur instantaneously to other threads, without requiring mutexes.

std::atomic Basics

#include <atomic>

std::atomic<int> counter{0};

void increment() {
    for (int i = 0; i < 100000; ++i) {
        ++counter;  // Atomic increment (no mutex needed)
    }
}

std::thread t1(increment);
std::thread t2(increment);
t1.join();
t2.join();

std::cout << counter;  // Guaranteed to be 200000

Atomic Operations

std::atomic<int> x{10};

x.store(20);           // Atomic write
int val = x.load();    // Atomic read
int old = x.exchange(30);  // Atomic swap

// Compare-and-swap
int expected = 30;
bool success = x.compare_exchange_strong(expected, 40);
// If x == expected, sets x = 40 and returns true
// Otherwise, sets expected = x and returns false

Lock-Free Property

std::atomic<int> counter{0};

if (counter.is_lock_free()) {
    std::cout << "True atomic (no locks)\\n";
} else {
    std::cout << "Uses mutex internally\\n";
}

// Compile-time check (C++17)
static_assert(std::atomic<int>::is_always_lock_free);

Most standard types are lock-free on modern architectures:

atomic<int>, atomic<long>: Usually lock-free
atomic<std::string>: Not lock-free (too large)
atomic<bool>, atomic<char>: Always lock-free

Memory Ordering

The C++ memory model defines how operations on different threads are ordered. Six memory ordering modes:

1. memory_order_relaxed

No synchronization, only atomicity:

std::atomic<int> x{0}, y{0};

// Thread 1
x.store(1, std::memory_order_relaxed);
y.store(2, std::memory_order_relaxed);

// Thread 2
int a = y.load(std::memory_order_relaxed);
int b = x.load(std::memory_order_relaxed);

// a=2, b=0 is possible! (reordering allowed)

Use: Independent counters, statistics

2. memory_order_acquire and memory_order_release

Establishes happens-before relationship:

std::atomic<bool> ready{false};
int data = 0;

// Producer thread
data = 42;  // Non-atomic write
ready.store(true, std::memory_order_release);  // Release

// Consumer thread
while (!ready.load(std::memory_order_acquire)) {}  // Acquire
std::cout << data;  // Guaranteed to see 42

Release-acquire pair synchronizes: all writes before release are visible after acquire.

3. memory_order_acq_rel

Both acquire and release:

std::atomic<int> x{0};

int prev = x.fetch_add(1, std::memory_order_acq_rel);
// Acquires previous value, releases new value

4. memory_order_seq_cst (Default)

Sequentially consistent: total global order:

std::atomic<int> x{0}, y{0};

// Thread 1
x.store(1);  // Default: memory_order_seq_cst
y.store(2);

// Thread 2
int a = y.load();
int b = x.load();

// If a == 2, then b must == 1 (total order guaranteed)

Most expensive but easiest to reason about. Default for most operations.

5. memory_order_consume (Deprecated)

Like acquire but weaker (rarely used, complex semantics).

Compare-Exchange

Fundamental building block for lock-free algorithms:

std::atomic<int> value{0};

void increment() {
    int expected = value.load();
    while (!value.compare_exchange_weak(expected, expected + 1)) {
        // expected was updated to current value, retry
    }
}

Strong vs Weak CAS

// Strong: Only fails if value != expected
compare_exchange_strong(expected, desired);

// Weak: May spuriously fail even if value == expected
// (Faster on some architectures)
compare_exchange_weak(expected, desired);

Use _weak in loops (spurious failures don’t matter).

Atomic Flags

Simplest atomic type, guaranteed lock-free:

std::atomic_flag flag = ATOMIC_FLAG_INIT;

bool was_set = flag.test_and_set();  // Atomically set to true, return old value
flag.clear();  // Set to false

Spinlock Implementation

class spinlock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // Spin until lock acquired
        }
    }
    
    void unlock() {
        flag_.clear(std::memory_order_release);
    }
};

Warning: Spinlocks waste CPU cycles. Use for very short critical sections only.

Lock-Free Stack Example

template<typename T>
class lock_free_stack {
    struct Node {
        T data;
        Node* next;
    };
    
    std::atomic<Node*> head_{nullptr};
    
public:
    void push(const T& value) {
        Node* new_node = new Node{value, head_.load()};
        
        while (!head_.compare_exchange_weak(new_node->next, new_node)) {
            // Retry with updated head
        }
    }
    
    bool pop(T& result) {
        Node* old_head = head_.load();
        
        while (old_head && !head_.compare_exchange_weak(old_head, old_head->next)) {
            // Retry
        }
        
        if (old_head) {
            result = old_head->data;
            delete old_head;
            return true;
        }
        return false;
    }
};

Caution: Lock-free programming is extremely difficult. Prefer mutexes unless profiling shows they’re a bottleneck.

When to Use Atomics vs Mutexes

Use Atomics When	Use Mutexes When
Single variable	Multiple variables
Simple operations	Complex critical sections
Performance critical	Simplicity matters
Lock-free required	Standard case

Conceptual Check

What is the default memory ordering for atomic operations?

Runtime Environment

Interactive Lab

1#include <iostream>

2#include <atomic>

3#include <thread>

4#include <vector>

6std::atomic<long> counter{0};

8void increment(int count) {

9 for (int i = 0; i < count; ++i) {

10 counter.fetch_add(1, std::memory_order_relaxed);

11 }

12}

14int main() {

15 std::cout << "atomic<long> is lock-free: "

16 << std::boolalpha << counter.is_lock_free() << '\n';

18 std::vector<std::thread> threads;

19 for (int i = 0; i < 4; ++i) {

20 threads.emplace_back(increment, 25000);

21 }

23 for (auto& t : threads) t.join();

25 std::cout << "Counter: " << counter << '\n';

26 return 0;

27}

System Console

Waiting for signal...

Atomic Operations and Memory Ordering

Atomic Operations: Lock-Free Synchronization

std::atomic Basics