Deep Dive (java.util) - ArrayList
What really happens under the hood
"The simplest data structure is often the fastest — if you understand what it's doing underneath."
Welcome to the first deep dive. We're going to tear apart java.util.ArrayList — the most commonly used collection in Java. By the end, you'll understand not just how to call .add() and .get(), but exactly what the JVM is doing with memory when you do.
What is ArrayList, Really?
At its core, ArrayList is a resizable array. That's it. It wraps a regular Java Object[] array and provides dynamic resizing when the array fills up.
The RandomAccess marker interface is important — it tells algorithms (like Collections.sort()) that this list supports constant-time random access via index. This changes how algorithms choose to iterate.
The Backing Array
Here's what ArrayList looks like inside the JVM:
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
// The actual storage — a plain Object array
transient Object[] elementData;
// The number of elements actually stored (NOT the array length)
private int size;
}
Critical distinction: size ≠ elementData.length.
size= how many elements you've addedelementData.length= the capacity (how many it can hold before resizing)
The blue slots contain our data. The dark slots are allocated but empty — wasted memory that exists so we don't have to resize on every single add().
How add() Works — Amortized Growth
This is the most important mechanism to understand. When you call list.add(element), here's what happens:
// Simplified from OpenJDK source
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Step 1: Make sure there's room
elementData[size++] = e; // Step 2: Store element, increment size
return true;
}
Step 1: Capacity Check
If size + 1 > elementData.length, the array is full. Time to grow.
Step 2: The Growth Algorithm
private void grow(int minCapacity) {
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1); // Grow by 50%
if (newCapacity < minCapacity)
newCapacity = minCapacity;
elementData = Arrays.copyOf(elementData, newCapacity);
}
The key formula: newCapacity = oldCapacity × 1.5 (using bit shift >> 1 for integer division by 2).
What Arrays.copyOf Does Under the Hood
System.arraycopy() is a native method — it drops down to the operating system's memcpy or memmove, which can use SIMD instructions for bulk memory copying. It's much faster than a Java for loop.
Amortized O(1)
Even though resizing is O(n), it happens so infrequently that the amortized cost of add() is O(1). Here's the math:
Adding 10 elements: 0 copies (fits initial capacity)
Adding element 11: 10 copies (resize to 15)
Adding elements 12–15: 0 copies
Adding element 16: 15 copies (resize to 22)
Total copies after 22 adds: 10 + 15 = 25 copies for 22 elements ≈ 1.1 copies/element
How get() and set() Work — O(1) Access
This is where ArrayList shines:
public E get(int index) {
rangeCheck(index); // throws IndexOutOfBoundsException if invalid
return elementData(index); // return elementData[index] — that's IT
}
public E set(int index, E element) {
rangeCheck(index);
E oldValue = elementData(index);
elementData[index] = element;
return oldValue;
}
It's a single array access. Arrays in Java are contiguous blocks of memory, so accessing elementData[index] is just:
address = baseAddress + (index × referenceSize)
On a 64-bit JVM with compressed oops (default for heaps < 32GB):
referenceSize= 4 byteselementData[5]=baseAddress + 20 bytes
That's one addition and one memory read. O(1) — constant time, regardless of list size.
How remove() Works — The Expensive Shift
This is where ArrayList pays a price:
public E remove(int index) {
rangeCheck(index);
modCount++;
E oldValue = elementData(index);
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index, numMoved);
elementData[--size] = null; // Let GC collect the removed element
return oldValue;
}
Every element after the removed one must shift left by one position.
Time complexity:
remove(0)— worst case: O(n), shifts ALL elementsremove(size-1)— best case: O(1), no shifting neededAverage: O(n/2) = O(n)
Pro tip: If you're removing elements while iterating, iterate backwards to avoid shifting issues and index corruption.
Memory Layout & Cache Locality
Here's why ArrayList is often faster than LinkedList even for operations where LinkedList has better theoretical complexity:
The Object[] array stores references (pointers) contiguously. When the CPU reads one reference, the entire cache line (typically 64 bytes) is loaded, bringing neighboring references into L1 cache for free.
On a compressed-oops JVM (4-byte references), one 64-byte cache line holds 16 references. Iterating through an ArrayList touches very few cache lines.
Key insight: The objects themselves may be scattered in the heap, but the references to them are packed together. This matters enormously for iteration speed.
Iterator and ConcurrentModificationException
ArrayList uses a fail-fast iterator:
private class Itr implements Iterator<E> {
int cursor; // next element to return
int lastRet = -1; // last element returned
int expectedModCount = modCount; // snapshot of structural modification count
public E next() {
checkForComodification(); // Throws if list was modified!
// ...
}
final void checkForComodification() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
}
}
Safe removal during iteration:
// ❌ WRONG — throws ConcurrentModificationException
for (String s : list) {
if (s.equals("remove-me")) {
list.remove(s); // Modifies list during iteration!
}
}
// ✅ CORRECT — use Iterator.remove()
Iterator<String> it = list.iterator();
while (it.hasNext()) {
if (it.next().equals("remove-me")) {
it.remove(); // Safe — modCount is updated in the iterator too
}
}
// ✅ ALSO CORRECT — use removeIf (Java 8+)
list.removeIf(s -> s.equals("remove-me"));
Sizing & Capacity Management
Default Capacity Behavior
// Default constructor — starts with EMPTY array (not 10!)
ArrayList<String> list1 = new ArrayList<>();
// elementData = {} (length 0)
// On first add(), expands to DEFAULT_CAPACITY = 10
// Pre-sized — avoids early resizing
ArrayList<String> list2 = new ArrayList<>(1000);
// elementData = new Object[1000]
Since Java 8: The default constructor uses an empty array
{}and only allocates the default capacity of 10 on the firstadd(). This saves memory for lists that are never populated.
trimToSize() — Reclaim Wasted Memory
ArrayList<String> list = new ArrayList<>(1000); // capacity=1000
list.add("only");
list.add("two");
list.add("items");
// size=3, capacity=1000 — wasting 997 slots!
list.trimToSize();
// size=3, capacity=3 — array shrunk to fit
ensureCapacity() — Pre-allocate for Bulk Inserts
ArrayList<String> list = new ArrayList<>();
list.ensureCapacity(50_000); // Pre-allocate before bulk insert
for (int i = 0; i < 50_000; i++) {
list.add("item-" + i); // No resizing needed!
}
ArrayList vs Arrays
| Aspect | Object[] Array |
ArrayList<E> |
|---|---|---|
| Size | Fixed at creation | Dynamic (auto-resizes) |
| Type safety | No generics (pre-Java) | Full generics support |
| Primitives | ✅ int[], double[] etc. |
❌ Must use Integer, Double (boxing) |
| Memory overhead | None | ~40 bytes for ArrayList object + unused capacity |
| API | Length only (arr.length) |
Rich API (sort, search, stream, etc.) |
| Thread safety | None | None (use Collections.synchronizedList()) |
| Null elements | ✅ Allowed | ✅ Allowed |
Performance tip: If you know the exact size and are working with primitives, use a raw array. The boxing/unboxing overhead of
ArrayList<Integer>vsint[]is significant in hot loops.
Common Pitfalls
Pitfall 1: Creating from Arrays.asList()
// ❌ TRAP: This returns a FIXED-SIZE list, not a real ArrayList!
List<String> fixed = Arrays.asList("a", "b", "c");
fixed.add("d"); // 💥 UnsupportedOperationException!
fixed.set(0, "z"); // ✅ This works — modification is OK, just not resizing
// ✅ FIX: Wrap in a real ArrayList
List<String> mutable = new ArrayList<>(Arrays.asList("a", "b", "c"));
mutable.add("d"); // ✅ Works
// ✅ Java 9+ alternative
List<String> immutable = List.of("a", "b", "c"); // Truly immutable
Pitfall 2: Forgetting Boxing Cost
// ❌ SLOW: Each int is autoboxed to Integer object
ArrayList<Integer> list = new ArrayList<>();
for (int i = 0; i < 1_000_000; i++) {
list.add(i); // autoboxing: Integer.valueOf(i) called each time
}
// Memory: 1M Integer objects × ~16 bytes each = ~16 MB of objects
// Plus: 1M references in Object[] = ~4 MB
// Total: ~20 MB for 1M ints
// ✅ BETTER: Use int[] if you just need storage
int[] array = new int[1_000_000];
// Memory: ~4 MB total (1M × 4 bytes)
Pitfall 3: Removing in a Forward Loop
// ❌ BUG: Skips elements after removal
List<String> list = new ArrayList<>(List.of("a", "b", "b", "c"));
for (int i = 0; i < list.size(); i++) {
if (list.get(i).equals("b")) {
list.remove(i); // After removing index 1, "b" at index 2 shifts to index 1
// Loop increments i to 2, skipping the second "b"!
}
}
// Result: ["a", "b", "c"] — missed one!
// ✅ FIX: Iterate backwards
for (int i = list.size() - 1; i >= 0; i--) {
if (list.get(i).equals("b")) {
list.remove(i);
}
}
When to Use ArrayList
Use ArrayList when:
You need indexed access (
get(i))You mostly add to the end (append)
You iterate sequentially (cache-friendly)
You don't know the size upfront
Avoid ArrayList when:
You frequently insert/remove at the beginning (O(n) shifting)
You need thread-safe operations (use
CopyOnWriteArrayListorCollections.synchronizedList)You're storing millions of primitives (use raw arrays or specialized libraries)
Summary
The one-liner: ArrayList is a dynamically-resized Object[] with O(1) random access, amortized O(1) appends, but O(n) inserts and removals that require shifting.