# Lock-Free Programming - Java (Part 6)

The Treiber Stack was our warm-up. Now for the main event: the **Michael-Scott Queue** — arguably the most important lock-free data structure ever designed. Published by Mick Michael and Michael Scott in 1996, it's the algorithm behind Java's `ConcurrentLinkedQueue` and has influenced every lock-free queue since.

A queue is harder than a stack because it has **two ends**: items go in at the tail and come out at the head. That means we need **two** atomic pointers working in harmony — and that's where things get interesting.

* * *

## Queues vs Stacks: Why Queues Are Harder

A stack has one "hot spot" — the top pointer. A queue has two:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/f82d3c69-1bce-4f8e-9d3d-6ed7179b12fc.png align="center")

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/5fd09ad6-eef4-422b-a209-def5ff019a48.png align="center")

The challenge: `enqueue` modifies the **tail**, `dequeue` modifies the **head**, and the last node's `next` pointer bridges them. We need all three to stay consistent without locks.

* * *

## The Sentinel Node Trick

The Michael-Scott Queue uses a clever trick: **a dummy (sentinel) node**. The queue always starts with an empty node that doesn't contain real data. This eliminates tricky edge cases when the queue is empty.

### Empty Queue

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/92e34d7a-58a0-454f-ae48-eae5e2212fcb.png align="center")

Both `head` and `tail` point to the same sentinel node. The queue is "empty" even though there's one node.

### After enqueue("A")

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/6bbca372-2b26-4243-9b59-0cd64ccbbe06.png align="center")

### After enqueue("B")

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/3478b6f7-293d-4daf-a322-36a276edb692.png align="center")

### After dequeue() → returns "A"

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/9399d0e7-aa90-4de6-a831-0a15cff5229d.png align="center")

Notice: when we dequeue, the **old sentinel is discarded** and the dequeued node **becomes the new sentinel**. Its value has been consumed; it now serves as the divider between "already consumed" and "available."

* * *

## The Full Implementation

```java
import java.util.concurrent.atomic.AtomicReference;

public class MichaelScottQueue<T> {
    
    private static class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next;
        
        Node(T value) {
            this.value = value;
            this.next = new AtomicReference<>(null);
        }
    }
    
    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;
    
    public MichaelScottQueue() {
        Node<T> sentinel = new Node<>(null);  // Dummy node
        head = new AtomicReference<>(sentinel);
        tail = new AtomicReference<>(sentinel);
    }
    
    /**
     * Enqueue: Add an item to the tail of the queue.
     * Lock-free: never blocks.
     */
    public void enqueue(T value) {
        Node<T> newNode = new Node<>(value);
        
        while (true) {
            Node<T> currentTail = tail.get();
            Node<T> tailNext = currentTail.next.get();
            
            // Is tail still the last node?
            if (currentTail == tail.get()) {  // Re-check consistency
                
                if (tailNext == null) {
                    // Tail IS the last node. Try to link our new node after it.
                    if (currentTail.next.compareAndSet(null, newNode)) {
                        // Linked! Now try to swing tail forward.
                        tail.compareAndSet(currentTail, newNode);
                        return;  // Enqueue complete
                    }
                } else {
                    // Tail is NOT the last node — another enqueue is in progress!
                    // Help it by advancing tail forward.
                    tail.compareAndSet(currentTail, tailNext);
                }
            }
        }
    }
    
    /**
     * Dequeue: Remove and return the item at the head of the queue.
     * Returns null if the queue is empty.
     * Lock-free: never blocks.
     */
    public T dequeue() {
        while (true) {
            Node<T> currentHead = head.get();
            Node<T> currentTail = tail.get();
            Node<T> headNext = currentHead.next.get();
            
            // Is head still consistent?
            if (currentHead == head.get()) {
                
                if (currentHead == currentTail) {
                    // Head and tail point to the same node
                    if (headNext == null) {
                        return null;  // Queue is truly empty
                    }
                    // Tail is lagging behind — advance it
                    tail.compareAndSet(currentTail, headNext);
                    
                } else {
                    // Read the value BEFORE the CAS (another thread might 
                    // dequeue and free this node right after our CAS)
                    T value = headNext.value;
                    
                    // Try to advance head to the next node
                    if (head.compareAndSet(currentHead, headNext)) {
                        return value;  // Dequeue complete
                    }
                }
            }
        }
    }
    
    public boolean isEmpty() {
        Node<T> currentHead = head.get();
        return currentHead.next.get() == null;
    }
}
```

This is more complex than the Treiber Stack. Let's break down the tricky parts.

* * *

## Enqueue: Step by Step

The enqueue has two phases:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/32f16182-e9d3-4085-b11d-14baa8dd24de.png align="center")

### The Two-Step Protocol

Enqueue is **not** a single CAS. It requires **two** modifications:

1.  **Link** the new node after the current tail (`tail.next = newNode`)
    
2.  **Swing** the tail pointer forward (`tail = newNode`)
    

These two steps are separate CAS operations, which means there's a brief moment when the tail pointer is "lagging behind" — it still points to the old last node even though a new node has been linked.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/460c8565-802d-4845-afaf-06d7106f9b59.png align="center")

### The "Helping" Mechanism

This is the brilliant part. If Thread 2 starts an enqueue and finds that `tail.next` is not null, it means Thread 1 is between step 1 and step 2. Instead of waiting, Thread 2 **helps** by advancing the tail pointer for Thread 1!

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/811860a0-04a5-4f65-8a7f-71cfee7a0c04.png align="center")

This **helping** pattern is what makes the algorithm lock-free. No thread ever waits — if you find someone else mid-operation, you help them finish and then proceed with your own work.

* * *

## Dequeue: Step by Step

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/a4f0378f-18a9-4b9a-b3d0-9d8effbf4fc7.png align="center")

### Why Read the Value BEFORE the CAS?

This is a subtle but critical detail:

```java
T value = headNext.value;                       // Read FIRST
if (head.compareAndSet(currentHead, headNext)) { // CAS SECOND
    return value;
}
```

We read `headNext.value` **before** the CAS because once the CAS succeeds, `headNext` becomes the new sentinel node. Another thread might immediately dequeue and overwrite it. By reading the value first, we ensure we have a copy regardless of what other threads do.

* * *

## Concurrent Enqueue and Dequeue

Here's a full scenario with three threads operating simultaneously:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/2d061bc4-d01c-4ab3-949a-cbca9e2a6289.png align="center")

Everything works correctly despite all three threads operating at the same time!

* * *

## Testing the Michael-Scott Queue

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class MichaelScottQueueTest {
    
    public static void main(String[] args) throws Exception {
        MichaelScottQueue<Integer> queue = new MichaelScottQueue<>();
        int numProducers = 4;
        int numConsumers = 4;
        int itemsPerProducer = 100_000;
        
        Set<Integer> produced = ConcurrentHashMap.newKeySet();
        Set<Integer> consumed = ConcurrentHashMap.newKeySet();
        AtomicInteger produceCount = new AtomicInteger(0);
        AtomicInteger consumeCount = new AtomicInteger(0);
        
        ExecutorService executor = Executors.newFixedThreadPool(
            numProducers + numConsumers
        );
        
        // Start producers
        CountDownLatch producersDone = new CountDownLatch(numProducers);
        for (int p = 0; p < numProducers; p++) {
            final int producerId = p;
            executor.submit(() -> {
                for (int i = 0; i < itemsPerProducer; i++) {
                    int item = producerId * itemsPerProducer + i;
                    queue.enqueue(item);
                    produced.add(item);
                    produceCount.incrementAndGet();
                }
                producersDone.countDown();
            });
        }
        
        // Wait for all producers to finish
        producersDone.await();
        
        // Start consumers — drain the queue
        CountDownLatch consumersDone = new CountDownLatch(numConsumers);
        for (int c = 0; c < numConsumers; c++) {
            executor.submit(() -> {
                Integer item;
                while ((item = queue.dequeue()) != null) {
                    if (!consumed.add(item)) {
                        System.err.println("DUPLICATE consumed: " + item);
                    }
                    consumeCount.incrementAndGet();
                }
                consumersDone.countDown();
            });
        }
        
        consumersDone.await();
        executor.shutdown();
        
        int expected = numProducers * itemsPerProducer;
        System.out.printf("Produced: %d, Consumed: %d%n", 
            produceCount.get(), consumeCount.get());
        System.out.printf("Unique produced: %d, Unique consumed: %d%n",
            produced.size(), consumed.size());
        System.out.println(consumed.equals(produced) ? "✅ PASS" : "❌ FAIL");
    }
}
```

* * *

## Treiber Stack vs Michael-Scott Queue

| Feature | Treiber Stack | Michael-Scott Queue |
| --- | --- | --- |
| **Data structure** | LIFO (Last-In, First-Out) | FIFO (First-In, First-Out) |
| **Atomic pointers** | 1 (head) | 2 (head + tail) |
| **Nodes** | Fully immutable | Mutable `next` pointer (AtomicReference) |
| **Sentinel node** | Not needed | Required (simplifies empty queue) |
| **Helping** | Not needed | Yes — threads help advance lagging tail |
| **CAS operations per op** | 1 | 1-2 |
| **Complexity** | Simple | Moderate |
| **Real-world usage** | Less common | `ConcurrentLinkedQueue` is based on this |

* * *

## Why the Helping Pattern Matters

The helping mechanism is what elevates this from a clever hack to a principled design:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/af49f13d-37ff-4ada-b4ec-5bd419f18d6f.png align="center")

This "helping" principle appears across many lock-free algorithms:

> **If you find the data structure in an inconsistent state, fix it yourself and continue.**

* * *

## Key Takeaways

1.  The **Michael-Scott Queue** uses a linked list with CAS on both **head** and **tail** pointers
    
2.  A **sentinel (dummy) node** simplifies empty-queue handling
    
3.  **Enqueue** is a two-step operation: link the new node, then swing the tail
    
4.  The **helping mechanism** ensures lock-freedom: if tail is lagging, any thread can advance it
    
5.  **Read values before CAS** in dequeue — after CAS, the node becomes the new sentinel
    
6.  Java's `ConcurrentLinkedQueue` is based on this algorithm
