# Deep Dive (java.util) - ArrayList

> *"The simplest data structure is often the fastest — if you understand what it's doing underneath."*

Welcome to the first deep dive. We're going to tear apart `java.util.ArrayList` — the most commonly used collection in Java. By the end, you'll understand not just how to call `.add()` and `.get()`, but exactly what the JVM is doing with memory when you do.

* * *

## What is ArrayList, Really?

At its core, `ArrayList` is a **resizable array**. That's it. It wraps a regular Java `Object[]` array and provides dynamic resizing when the array fills up.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/6c6c9738-624f-4461-8dea-72d461eac546.png align="center")

The `RandomAccess` marker interface is important — it tells algorithms (like `Collections.sort()`) that this list supports **constant-time random access** via index. This changes how algorithms choose to iterate.

* * *

## The Backing Array

Here's what `ArrayList` looks like inside the JVM:

```java
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    
    // The actual storage — a plain Object array
    transient Object[] elementData;
    
    // The number of elements actually stored (NOT the array length)
    private int size;
}
```

**Critical distinction:** `size` ≠ `elementData.length`.

*   `size` = how many elements you've added
    
*   `elementData.length` = the **capacity** (how many it *can* hold before resizing)
    

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/73bad5a5-7596-47ba-b1fe-4057b0f0168e.png align="center")

The blue slots contain our data. The dark slots are **allocated but empty** — wasted memory that exists so we don't have to resize on every single `add()`.

* * *

## How add() Works — Amortized Growth

This is the most important mechanism to understand. When you call `list.add(element)`, here's what happens:

```java
// Simplified from OpenJDK source
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Step 1: Make sure there's room
    elementData[size++] = e;           // Step 2: Store element, increment size
    return true;
}
```

### Step 1: Capacity Check

If `size + 1 > elementData.length`, the array is full. Time to grow.

### Step 2: The Growth Algorithm

```java
private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1); // Grow by 50%
    if (newCapacity < minCapacity)
        newCapacity = minCapacity;
    elementData = Arrays.copyOf(elementData, newCapacity);
}
```

The key formula: **newCapacity = oldCapacity × 1.5** (using bit shift `>> 1` for integer division by 2).

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/b44e74b7-69b0-482f-a653-4eb092372b7d.png align="center")

### What `Arrays.copyOf` Does Under the Hood

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/c76c0d46-8d43-4314-a12c-099566ff065d.png align="center")

`System.arraycopy()` is a **native method** — it drops down to the operating system's `memcpy` or `memmove`, which can use SIMD instructions for bulk memory copying. It's much faster than a Java `for` loop.

### Amortized O(1)

Even though resizing is O(n), it happens so infrequently that the **amortized** cost of `add()` is O(1). Here's the math:

*   Adding 10 elements: 0 copies (fits initial capacity)
    
*   Adding element 11: 10 copies (resize to 15)
    
*   Adding elements 12–15: 0 copies
    
*   Adding element 16: 15 copies (resize to 22)
    
*   Total copies after 22 adds: 10 + 15 = 25 copies for 22 elements ≈ 1.1 copies/element
    

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/01718f38-be9b-49c3-91f6-2d2260130e87.png align="center")

* * *

## How get() and set() Work — O(1) Access

This is where ArrayList shines:

```java
public E get(int index) {
    rangeCheck(index);          // throws IndexOutOfBoundsException if invalid
    return elementData(index);  // return elementData[index] — that's IT
}

public E set(int index, E element) {
    rangeCheck(index);
    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}
```

It's a **single array access**. Arrays in Java are contiguous blocks of memory, so accessing `elementData[index]` is just:

```plaintext
address = baseAddress + (index × referenceSize)
```

On a 64-bit JVM with compressed oops (default for heaps < 32GB):

*   `referenceSize` = 4 bytes
    
*   `elementData[5]` = `baseAddress + 20 bytes`
    

That's one addition and one memory read. **O(1)** — constant time, regardless of list size.

* * *

## How remove() Works — The Expensive Shift

This is where ArrayList pays a price:

```java
public E remove(int index) {
    rangeCheck(index);
    modCount++;
    
    E oldValue = elementData(index);
    
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    
    elementData[--size] = null; // Let GC collect the removed element
    return oldValue;
}
```

**Every element after the removed one must shift left by one position.**

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/c3ff27f4-334a-400e-90bf-f3ffcfb95170.png align="center")

**Time complexity:**

*   `remove(0)` — worst case: O(n), shifts ALL elements
    
*   `remove(size-1)` — best case: O(1), no shifting needed
    
*   Average: O(n/2) = **O(n)**
    

> **Pro tip:** If you're removing elements while iterating, iterate **backwards** to avoid shifting issues and index corruption.

* * *

## Memory Layout & Cache Locality

Here's why ArrayList is often faster than LinkedList even for operations where LinkedList has better theoretical complexity:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/030dc281-0d29-43d5-9c9e-dc0f1116d60a.png align="center")

The `Object[]` array stores **references** (pointers) contiguously. When the CPU reads one reference, the entire cache line (typically 64 bytes) is loaded, bringing neighboring references into L1 cache for free.

On a compressed-oops JVM (4-byte references), one 64-byte cache line holds **16 references**. Iterating through an ArrayList touches very few cache lines.

> **Key insight:** The *objects themselves* may be scattered in the heap, but the *references to them* are packed together. This matters enormously for iteration speed.

* * *

## Iterator and ConcurrentModificationException

ArrayList uses a **fail-fast** iterator:

```java
private class Itr implements Iterator<E> {
    int cursor;       // next element to return
    int lastRet = -1; // last element returned
    int expectedModCount = modCount;  // snapshot of structural modification count
    
    public E next() {
        checkForComodification();  // Throws if list was modified!
        // ...
    }
    
    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
```

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/6a3b5fc2-74df-49a5-9ad3-433261a738b5.png align="center")

**Safe removal during iteration:**

```java
// ❌ WRONG — throws ConcurrentModificationException
for (String s : list) {
    if (s.equals("remove-me")) {
        list.remove(s);  // Modifies list during iteration!
    }
}

// ✅ CORRECT — use Iterator.remove()
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    if (it.next().equals("remove-me")) {
        it.remove();  // Safe — modCount is updated in the iterator too
    }
}

// ✅ ALSO CORRECT — use removeIf (Java 8+)
list.removeIf(s -> s.equals("remove-me"));
```

* * *

## Sizing & Capacity Management

### Default Capacity Behavior

```java
// Default constructor — starts with EMPTY array (not 10!)
ArrayList<String> list1 = new ArrayList<>();
// elementData = {} (length 0)
// On first add(), expands to DEFAULT_CAPACITY = 10

// Pre-sized — avoids early resizing
ArrayList<String> list2 = new ArrayList<>(1000);
// elementData = new Object[1000]
```

> **Since Java 8:** The default constructor uses an empty array `{}` and only allocates the default capacity of 10 on the first `add()`. This saves memory for lists that are never populated.

### `trimToSize()` — Reclaim Wasted Memory

```java
ArrayList<String> list = new ArrayList<>(1000); // capacity=1000
list.add("only");
list.add("two");
list.add("items");
// size=3, capacity=1000 — wasting 997 slots!

list.trimToSize();
// size=3, capacity=3 — array shrunk to fit
```

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/43dcc1a6-caf5-475b-a484-fd4bbc6f2746.png align="center")

### `ensureCapacity()` — Pre-allocate for Bulk Inserts

```java
ArrayList<String> list = new ArrayList<>();
list.ensureCapacity(50_000); // Pre-allocate before bulk insert

for (int i = 0; i < 50_000; i++) {
    list.add("item-" + i);  // No resizing needed!
}
```

* * *

## ArrayList vs Arrays

| Aspect | `Object[]` Array | `ArrayList<E>` |
| --- | --- | --- |
| Size | Fixed at creation | Dynamic (auto-resizes) |
| Type safety | No generics (pre-Java) | Full generics support |
| Primitives | ✅ `int[]`, `double[]` etc. | ❌ Must use `Integer`, `Double` (boxing) |
| Memory overhead | None | ~40 bytes for ArrayList object + unused capacity |
| API | Length only (`arr.length`) | Rich API (sort, search, stream, etc.) |
| Thread safety | None | None (use `Collections.synchronizedList()`) |
| Null elements | ✅ Allowed | ✅ Allowed |

> **Performance tip:** If you know the exact size and are working with primitives, use a raw array. The boxing/unboxing overhead of `ArrayList<Integer>` vs `int[]` is significant in hot loops.

* * *

## Common Pitfalls

### Pitfall 1: Creating from `Arrays.asList()`

```java
// ❌ TRAP: This returns a FIXED-SIZE list, not a real ArrayList!
List<String> fixed = Arrays.asList("a", "b", "c");
fixed.add("d");    // 💥 UnsupportedOperationException!
fixed.set(0, "z"); // ✅ This works — modification is OK, just not resizing

// ✅ FIX: Wrap in a real ArrayList
List<String> mutable = new ArrayList<>(Arrays.asList("a", "b", "c"));
mutable.add("d");  // ✅ Works

// ✅ Java 9+ alternative
List<String> immutable = List.of("a", "b", "c");  // Truly immutable
```

### Pitfall 2: Forgetting Boxing Cost

```java
// ❌ SLOW: Each int is autoboxed to Integer object
ArrayList<Integer> list = new ArrayList<>();
for (int i = 0; i < 1_000_000; i++) {
    list.add(i);  // autoboxing: Integer.valueOf(i) called each time
}

// Memory: 1M Integer objects × ~16 bytes each = ~16 MB of objects
// Plus: 1M references in Object[] = ~4 MB
// Total: ~20 MB for 1M ints

// ✅ BETTER: Use int[] if you just need storage
int[] array = new int[1_000_000];
// Memory: ~4 MB total (1M × 4 bytes)
```

### Pitfall 3: Removing in a Forward Loop

```java
// ❌ BUG: Skips elements after removal
List<String> list = new ArrayList<>(List.of("a", "b", "b", "c"));
for (int i = 0; i < list.size(); i++) {
    if (list.get(i).equals("b")) {
        list.remove(i);  // After removing index 1, "b" at index 2 shifts to index 1
                         // Loop increments i to 2, skipping the second "b"!
    }
}
// Result: ["a", "b", "c"] — missed one!

// ✅ FIX: Iterate backwards
for (int i = list.size() - 1; i >= 0; i--) {
    if (list.get(i).equals("b")) {
        list.remove(i);
    }
}
```

* * *

## When to Use ArrayList

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/67ed4655-75d4-4564-9e11-f6428dbcb0af.png align="center")

**Use ArrayList when:**

*   You need indexed access (`get(i)`)
    
*   You mostly add to the end (append)
    
*   You iterate sequentially (cache-friendly)
    
*   You don't know the size upfront
    

**Avoid ArrayList when:**

*   You frequently insert/remove at the beginning (O(n) shifting)
    
*   You need thread-safe operations (use `CopyOnWriteArrayList` or `Collections.synchronizedList`)
    
*   You're storing millions of primitives (use raw arrays or specialized libraries)
    

* * *

## Summary

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/e76d1531-f230-4d4f-a5f0-082266ef913d.png align="center")

**The one-liner:** ArrayList is a dynamically-resized `Object[]` with O(1) random access, amortized O(1) appends, but O(n) inserts and removals that require shifting.
