Skip to main content

Command Palette

Search for a command to run...

Deep Dive (java.util.concurrent) -ConcurrentHashMap

Lock Striping and Concurrency

Published
8 min read

"The map that lets multiple threads read and write simultaneously — without tearing itself apart."

ConcurrentHashMap is the crown jewel of java.util.concurrent. This deep dive covers its evolution from segmented locks (Java 7) to CAS + synchronized nodes (Java 8+), and explains how it achieves thread-safe O(1) operations without global locking.


Why Not Just Synchronize HashMap?

// Option 1: Collections.synchronizedMap — SLOW
Map<String, Integer> sync = Collections.synchronizedMap(new HashMap<>());
// Every method acquires the SAME lock:
// Thread A: sync.get("x") → LOCKED → Thread B waits for ANY operation
// Thread C: sync.put("y", 1) → waits for Thread A to finish get()!

// Option 2: ConcurrentHashMap — FAST
ConcurrentHashMap<String, Integer> chm = new ConcurrentHashMap<>();
// Thread A: chm.get("x") → NO LOCK (volatile read)
// Thread B: chm.put("y", 1) → locks only bucket for "y"
// Thread C: chm.put("z", 2) → locks only bucket for "z" (different bucket!)

Java 7: Segmented Locking (Historical)

In Java 7, ConcurrentHashMap used an array of Segments, each being a mini-HashMap with its own lock:

Limitations of Java 7 approach:

  • Fixed number of segments (default 16)

  • Each segment is a separate hash table → wasted memory

  • Resizing was per-segment, not global

  • size() required locking ALL segments


Java 8+: CAS + Per-Bucket Synchronization

Java 8 completely rewrote ConcurrentHashMap. The new design:

The Node Class

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;      // volatile! Visible to reading threads without lock
    volatile Node<K,V> next; // volatile! Chain is visible without lock
}

The volatile keyword is critical — it means reading threads can see the latest values without acquiring any lock.


How put() Works — Thread-Safe Insert

Simplified Source

final V putVal(K key, V value, boolean onlyIfAbsent) {
    int hash = spread(key.hashCode());
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();  // Lazy init with CAS
        
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // Empty bucket → CAS (no lock!)
            if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value)))
                break;  // CAS won
            // CAS failed → retry loop
        }
        
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);  // Help ongoing resize!
        
        else {
            // Non-empty bucket → lock the first node
            synchronized (f) {
                if (tabAt(tab, i) == f) {  // Double-check under lock
                    // Walk chain or tree, insert/update
                }
            }
        }
    }
    addCount(1L, binCount);  // Update size with spread counter
    return null;
}

Key design decisions:

  1. CAS for empty buckets — no locking needed for the common case

  2. synchronized per bucket — only the first node of each bucket is the lock monitor

  3. Double-check after locking — the bucket might have been resized/moved

  4. Helping mechanism — if a resize is in progress, the thread helps!


How get() Works — Lock-Free Read

This is the beautiful part. get() never acquires any lock:

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {  // volatile read
        
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;  // volatile read of val
        }
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;  // TreeBin
        
        while ((e = e.next) != null) {  // volatile read of next
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}

Why is this safe?

  1. table reference is volatile

  2. Node.val and Node.next are volatile

  3. tabAt() uses Unsafe.getObjectVolatile() for array element access

  4. Volatile reads establish happens-before relationships with writes


Concurrent Resizing — Helping Mechanism

When ConcurrentHashMap needs to resize, multiple threads can help transfer entries from the old table to the new table simultaneously:

// When a thread encounters hash == MOVED during put():
else if ((fh = f.hash) == MOVED)
    tab = helpTransfer(tab, f);  // Join the resize effort!

Each thread claims a range of buckets to transfer, preventing duplication. Once a bucket is transferred, a ForwardingNode is placed in the old table to redirect readers to the new table.


Counting Elements — The Spread Counter

size() in a concurrent map is tricky. ConcurrentHashMap uses a distributed counter (inspired by LongAdder):

// Instead of one volatile long size:
private transient volatile long baseCount;
private transient volatile CounterCell[] counterCells;

// Each thread increments a different cell → no contention!

How it works:

  1. On put(), try CAS on baseCount

  2. If CAS fails (contention), pick a random CounterCell and increment it

  3. size() sums baseCount + all CounterCell values

  4. The sum is approximate (reads are not atomic across cells)

Warning: size() returns a best-effort estimate in a concurrent environment. It may not reflect in-flight operations.


Java 8 added parallel bulk operations with a parallelismThreshold:

ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
// ... populate ...

// Parallel forEach — threshold controls when to go parallel
map.forEach(1000, // parallelize if size > 1000
    (key, value) -> System.out.println(key + "=" + value)
);

// Parallel reduce — find max value
int max = map.reduceValues(1000,
    (a, b) -> Math.max(a, b)
);

// Parallel search — find first match
String found = map.search(1000,
    (key, value) -> value > 100 ? key : null
);

// Parallel with transformation
long sum = map.reduceValuesToLong(1000,
    value -> (long) value,  // transform
    0L,                     // identity
    Long::sum               // reduce
);

ConcurrentHashMap vs Alternatives

Feature ConcurrentHashMap synchronizedMap Hashtable
Lock granularity Per-bucket Entire map Entire map
Read locking None (volatile) Global lock Global lock
Null keys/values ❌ Neither ✅ Both ❌ Neither
ConcurrentModificationException ❌ Never ✅ Possible ✅ Possible
Bulk parallel ops ✅ forEach/reduce/search
Throughput (highly concurrent) 🟢 Excellent 🔴 Poor 🔴 Poor
Iterator semantics Weakly consistent Fail-fast Fail-fast
Legacy Java 5+ Java 2+ Java 1.0

Common Pitfalls

Pitfall 1: Null Keys and Values Not Allowed

// ❌ NullPointerException!
ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
map.put(null, "value");  // NPE!
map.put("key", null);    // NPE!

// Why? Because get() returning null would be ambiguous:
// does the key not exist, or is the value null?
// In a concurrent context, this ambiguity is dangerous.

Pitfall 2: Compound Operations Are Not Atomic

// ❌ RACE CONDITION: check-then-act is not atomic
if (!map.containsKey("key")) {
    map.put("key", computeValue());  // Another thread could put between these two!
}

// ✅ Use atomic compound operations
map.putIfAbsent("key", computeValue());
map.computeIfAbsent("key", k -> computeValue());
map.compute("key", (k, v) -> v == null ? 1 : v + 1);
map.merge("key", 1, Integer::sum);

Pitfall 3: size() is Approximate

// ❌ Don't rely on exact size in concurrent code
if (map.size() == 0) {
    // Another thread might add an element right HERE
    doSomethingAssumingEmpty(); // WRONG!
}

// ✅ size() is best-effort. Use isEmpty() when possible
// (though it's also a snapshot, it's at least a single volatile read)

Pitfall 4: Iterators Are Weakly Consistent

// Iterators may or may not reflect concurrent modifications
for (Map.Entry<String, Integer> entry : map.entrySet()) {
    // This will NOT throw ConcurrentModificationException
    // But may not see entries added by other threads during iteration
    // And may or may not see entries removed by other threads
}

Summary

The one-liner: ConcurrentHashMap uses CAS for empty buckets, per-node synchronized blocks for chains, volatile reads for lock-free gets, and cooperative resizing — making it the only correct choice for concurrent Map access on the JVM.