Concurrency

Concurrency is about structuring a program as multiple threads of execution whose lifetimes overlap, whether interleaved on a single core or running in parallel across multiple cores.

There are broadly two styles of concurrency:

Shared Memory

This is the more popular form of concurrency: multiple threads share regions of memory. Those regions must be guarded to guarantee mutual exclusion, i.e. that only one thread accesses them at a time.

Note: to understand how mutexes are implemented, see the Atomics section below.

Mutexes

Mutexes ensure exclusive access to a shared resource in a multi-threaded environment. A thread locks a mutex before accessing a shared resource and unlocks it after. Example:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_mutex_lock(&mutex);
// Critical section: only one thread executes this at a time
pthread_mutex_unlock(&mutex);

Condition Variables

Condition variables let a thread sleep until some condition becomes true. They're used together with a mutex that protects the condition, and the wait belongs in a loop because wakeups can be spurious. Example:

while (!ready)                        // re-check: wakeups can be spurious
    pthread_cond_wait(&cond, &mutex); // atomically unlocks mutex and sleeps

pthread_cond_signal(&cond);           // elsewhere: wake one waiting thread
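
A minimal sketch of the same pattern with Rust's std::sync::Condvar, where wait() takes and returns the mutex guard; note the predicate loop again:

use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        *lock.lock().unwrap() = true; // make the condition true...
        cvar.notify_one();            // ...and wake the waiting thread
    });

    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    while !*ready {
        // wait() atomically releases the lock while sleeping and re-acquires
        // it on wakeup; the loop re-checks in case the wakeup was spurious.
        ready = cvar.wait(ready).unwrap();
    }
    println!("condition met");
}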

Semaphores

Semaphores manage access to shared resources via a counter, allowing up to N threads in at once. Unlike mutexes, sem_post is async-signal-safe, which makes semaphores usable from signal handlers. Example:

sem_init(&sem, 0, 1); // shared between threads, initial count 1

sem_wait(&sem); // decrement; block while the count is zero
sem_post(&sem); // increment, potentially unblocking a waiter
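
Rust's standard library has no semaphore, which makes it a nice exercise: a minimal counting-semaphore sketch built from a Mutex and a Condvar (in async code, tokio::sync::Semaphore would be the off-the-shelf choice):

use std::sync::{Condvar, Mutex};

pub struct Semaphore {
    count: Mutex<usize>,
    cvar: Condvar,
}

impl Semaphore {
    pub fn new(count: usize) -> Self {
        Self { count: Mutex::new(count), cvar: Condvar::new() }
    }

    // wait / P: block until the count is positive, then decrement.
    pub fn acquire(&self) {
        let mut count = self.count.lock().unwrap();
        while *count == 0 {
            count = self.cvar.wait(count).unwrap();
        }
        *count -= 1;
    }

    // post / V: increment and wake one blocked waiter.
    pub fn release(&self) {
        *self.count.lock().unwrap() += 1;
        self.cvar.notify_one();
    }
}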

Atomics

Atomics are a complicated low-level topic; they form the foundation on top of which mutexes are built.

An atomic operation is an indivisible operation: no other thread can ever observe it half-complete. C++ provides several atomic types, such as std::atomic_flag and std::atomic, a template with specialisations for each of the standard types.

Operations On Atomics

  1. load()
  2. store()
  3. exchange()
  4. compare_exchange_weak()
  5. compare_exchange_strong()
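
A quick demonstration of what these operations do, sketched in Rust (where C++'s exchange is called swap and compare_exchange_strong is plain compare_exchange):

use std::sync::atomic::{AtomicI32, Ordering};

fn main() {
    let x = AtomicI32::new(5);

    x.store(7, Ordering::SeqCst);               // atomic write
    assert_eq!(x.load(Ordering::SeqCst), 7);    // atomic read
    assert_eq!(x.swap(9, Ordering::SeqCst), 7); // write, returning the old value

    // CAS: replace the value only if it still holds what we expect.
    // This one fails because x is 9, not 7; Err carries the actual value.
    assert_eq!(x.compare_exchange(7, 10, Ordering::SeqCst, Ordering::SeqCst), Err(9));
    // This one succeeds: x was 9 and becomes 10.
    assert_eq!(x.compare_exchange(9, 10, Ordering::SeqCst, Ordering::SeqCst), Ok(9));
}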

Memory Orderings

  1. memory_order_relaxed
  2. memory_order_acquire
  3. memory_order_consume
  4. memory_order_acq_rel
  5. memory_order_release
  6. memory_order_seq_cst

Each atomic operation accepts a certain subset of these memory orderings. Combining atomic operations with the right orderings establishes happens-before relationships between threads, e.g. a release store that pairs with an acquire load, as in the spinlock sketch below.
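
As a sketch of both ideas at once, here's roughly how a mutex's fast path can be built from an atomic: a spinlock whose lock is an acquire CAS and whose unlock is a release store (a real mutex would park the thread instead of spinning forever):

use std::sync::atomic::{AtomicBool, Ordering};

pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    pub fn lock(&self) {
        // Try to flip false -> true. Acquire pairs with the Release in
        // unlock(), so writes made under the lock are visible to the
        // next thread that acquires it.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}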

For a small set of implementations using atomics in Rust, see here

Communicating Memory

This is the style of concurrent programming popularized by Go. By passing messages over channels instead of sharing mutable memory, it sidesteps classic problems like data races and forgotten locks. This style is also called Communicating Sequential Processes (CSP).

package main

import (
    "fmt"
    "time"
)

func worker(done chan bool) {
    fmt.Print("working...")
    time.Sleep(time.Second)
    fmt.Println("done")

    done <- true
}

func main() {
    done := make(chan bool, 1)
    go worker(done)

    <-done
}

The main goroutine blocks on the channel receive until the worker sends a value on done.

Rust's mpsc channels are another example of this style. The core principle is captured by the Go proverb: "Don't communicate by sharing memory; share memory by communicating."
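
A minimal sketch of Rust's std::sync::mpsc channels, with several producers and one consumer:

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // Multiple producers: each worker thread gets its own clone of the sender.
    for id in 0..3 {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(format!("worker {id} done")).unwrap();
        });
    }
    // Drop the original sender so the channel closes once the workers finish.
    drop(tx);

    // Single consumer: iterating the receiver blocks until the channel closes.
    for msg in rx {
        println!("{msg}");
    }
}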

CPS (Continuation Passing Style)

An alternative to the CSP above is CPS, a fancy name for passing the rest of the computation into a function as a callback. JavaScript's callback-based APIs are the most familiar example.

function fetchData(url, callback) {
  // Simulating an asynchronous API call
  setTimeout(() => {
    const data = { id: 1, name: "John Doe" };
    callback(data);
  }, 1000);
}

function processData(data) {
  console.log("Processing data:", data);
}

fetchData("<https://example.com/api>", processData);

Blog post detailing this more: https://matt.might.net/articles/by-example-continuation-passing-style/

Promises are a nicer packaging of the same CPS paradigm: the continuation goes into .then() and chains instead of nesting.

step1(5)
    .then(result1 => step2(result1))
    .then(result2 => step3(result2))
    .then(finalResult => console.log("Final result:", finalResult))
    .catch(error => console.error("An error occurred", error));

Async Runtimes, Web Servers & Async IO

This section is a little specific to Rust. Async runtimes are about managing concurrency in the context of an application - a somewhat higher level than mutexes and CSP.

Roughly what they do

Any runtime has the following core concepts:

  1. Scheduler
  2. Tasks & Futures
  3. Wakers
  4. Reactors

Scheduler

The scheduler decides which task runs next: roughly, it combines a run queue of ready tasks with a loop that pops them off and polls their futures. See the executor sketch at the end of the Wakers section below.

Tasks & Futures

Futures (at least in Rust) are anything that implements the Future trait: a state machine the executor polls, returning Poll::Ready when finished or Poll::Pending after arranging to be woken later.

use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};

struct SharedState {
    completed: bool,
    waker: Option<Waker>,
}

pub struct TimerFuture {
    shared_state: Arc<Mutex<SharedState>>,
}

impl Future for TimerFuture {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut shared_state = self.shared_state.lock().unwrap();
        if shared_state.completed {
            return Poll::Ready(());
        }
        // Stash the waker so whoever completes the timer can requeue this task.
        shared_state.waker = Some(cx.waker().clone());
        Poll::Pending
    }
}
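
The other half of the example (following the async Rust book's version): a timer thread completes the shared state and invokes the stored waker, which is what gets the future polled again:

impl TimerFuture {
    pub fn new(duration: std::time::Duration) -> Self {
        let shared_state = Arc::new(Mutex::new(SharedState {
            completed: false,
            waker: None,
        }));

        let thread_shared_state = Arc::clone(&shared_state);
        std::thread::spawn(move || {
            std::thread::sleep(duration);
            let mut shared_state = thread_shared_state.lock().unwrap();
            shared_state.completed = true;
            // Wake the task so the executor polls this future again.
            if let Some(waker) = shared_state.waker.take() {
                waker.wake();
            }
        });

        TimerFuture { shared_state }
    }
}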

Wakers

A waker is an object that can be used to put the task/future back onto a run queue.

use futures::task::ArcWake;
use std::sync::Arc;

impl ArcWake for Task {
    fn wake_by_ref(arc_self: &Arc<Self>) {
        // Re-enqueue the task by sending a clone of it down the run queue channel.
        let cloned = Arc::clone(arc_self);
        arc_self.task_sender.send(cloned).unwrap();
    }
}
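
Putting the scheduler, tasks and wakers together, a minimal executor loop in the style of the async Rust book, where the run queue is simply an mpsc channel of tasks (Task here is the type the ArcWake impl above is written for):

use futures::future::BoxFuture;
use futures::task::waker_ref;
use std::sync::mpsc::{Receiver, SyncSender};
use std::sync::{Arc, Mutex};
use std::task::Context;

// A task is a top-level future plus a handle back onto the run queue,
// so its waker can re-enqueue it.
pub struct Task {
    future: Mutex<Option<BoxFuture<'static, ()>>>,
    task_sender: SyncSender<Arc<Task>>,
}

pub struct Executor {
    ready_queue: Receiver<Arc<Task>>,
}

impl Executor {
    pub fn run(&self) {
        // Pop a task, poll its future once, and if it's still pending
        // leave it parked until its waker sends it down the channel again.
        while let Ok(task) = self.ready_queue.recv() {
            let mut slot = task.future.lock().unwrap();
            if let Some(mut future) = slot.take() {
                let waker = waker_ref(&task);
                let cx = &mut Context::from_waker(&waker);
                if future.as_mut().poll(cx).is_pending() {
                    *slot = Some(future); // not finished; park it again
                }
            }
        }
    }
}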

Reactors

Reactors are components that react to the readiness of events. They're connected to the async runtime and enqueue tasks onto the scheduler once the events those tasks were waiting on fire.

Mechanisms typically include poll, epoll (on Linux) and kqueue (on BSD/macOS). These are typically used to signal readiness of sockets so that events can be read from them.

It’s important to discuss here the difference between the different types of async io models - readiness based & completion based.

epoll and kqueue fall under the readiness-based model: the syscall only tells you when a file descriptor can be read from or written to, and the actual read or write costs an additional syscall.

io_uring, on the other hand, falls under the completion-based model: you submit the operations you want performed on your file descriptors and get the actual results back, avoiding that extra syscall.

One interesting consequence: the readiness-based model makes no sense for regular file I/O. Waiting for a file to become "ready" means nothing, since regular files always report as ready.
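
A small readiness-based sketch using the mio crate (an assumption here; it's one portable wrapper over epoll/kqueue): register interest in a socket, block until the OS reports it ready, then perform the actual accept yourself:

use mio::net::TcpListener;
use mio::{Events, Interest, Poll, Token};

fn main() -> std::io::Result<()> {
    let mut listener = TcpListener::bind("127.0.0.1:8080".parse().unwrap())?;
    let mut poll = Poll::new()?;
    let mut events = Events::with_capacity(128);

    const SERVER: Token = Token(0);
    poll.registry()
        .register(&mut listener, SERVER, Interest::READABLE)?;

    loop {
        // Block until at least one registered source is ready.
        poll.poll(&mut events, None)?;
        for event in events.iter() {
            if event.token() == SERVER {
                // Readiness only: the accept itself is a second syscall
                // (a production loop would also handle WouldBlock here).
                let (_conn, addr) = listener.accept()?;
                println!("accepted connection from {addr}");
            }
        }
    }
}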

Cooperative & Preemptive Schedulers

It’s important to discuss these two terms when talking about async runtimes. To understand this, consider the following two classes of coroutines:

  1. Stackful (have an OS-like stack allocated) -> also called green threads, fibers etc.
  2. Stackless (no call stack allocated) -> also called futures, promises etc.

Stackful coroutines can be pre-empted by the scheduler. Stackless coroutines cannot be pre-empted by the scheduler.

Mixing and matching the two axes:

  1. stackless + cooperative → Rust futures, JS promises
  2. stackful + preemptive → goroutines
  3. stackless + preemptive → isn't possible
  4. stackful + cooperative → fibers, e.g. Lua coroutines

Stackless coroutines require rewiring the language around "coloured" (async vs sync) functions; stackful coroutines don't. See Bob Nystrom's blog post "What Color Is Your Function?".

Styles Of Runtimes

Broadly:

  1. single-threaded - all tasks run on the one thread that spawned them
  2. work-stealing - idle worker threads steal ready tasks from busy ones (Tokio's default)
  3. thread-per-core - one runtime per core, tasks stay where they were spawned (e.g. glommio, monoio)

Resources

External

Atomics & Memory Ordering

  1. Understanding Atomics And Memory Ordering
  2. The memory order reference from cppreference
  3. Atomics on the GCC Wiki
  4. Memory Ordering At Compile Time
  5. Memory Barriers From The Linux Kernel Documentation
  6. ArangoDB’s blog on memory barriers in C++
  7. The Danger Of Atomic Operations

Sync & Async Web Servers

  1. How Async Await Works In Python
  2. Concurrent Servers Part I
  3. Rust Book Final Project - Web Server
  4. Async Rust Book Final Project
  5. Too Many Web Servers

Async Runtimes

Schedulers

  1. Tokio Scheduler
  2. How Tokio Schedules Tasks
  3. InfluxDB Using Tokio For CPU Bound Tasks
  4. Scheduling Internals

Reactors / Async IO

  1. Abstraction Over kqueue & iouring
  2. io_uring basics
  3. monoio implementation blog post

Miscellaneous

  1. Async I/O In Depth Playlist
  2. GitHub Repo For Above Playlist
  3. Withoutboats on thread-per-core [paper referenced in post]
  4. Thread-Per-Core In Async Rust
  5. Phil’s tweet comparing runtime approaches

Personal Blog

  1. Async Runtimes
  2. Async Runtimes II
  3. Async Runtimes III
  4. Atomics & Memory Ordering