# Java Memory Deep Dive

## 1\. Core GC Concepts

### 1.1 What is Garbage?

An object is garbage when it is **unreachable** from every GC root. HotSpot does not use reference counting to decide liveness (unlike CPython or Swift); it traces the object graph starting from the roots.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/7d4d5308-e553-4bcb-9410-4a5ec06fb8d3.png align="center")

> \[!IMPORTANT\] **Interview Key**: Objects D, E, F form a reference cycle but are STILL garbage because no GC root can reach them. Java GC handles cycles correctly — unlike naive reference counting.
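
The idea can be sketched as a toy mark phase over a hand-built object graph. This is a minimal illustration, not how HotSpot is implemented: the `Node` class, the `mark` method, and the objects A, B, and C are assumptions made for the example (only the D, E, F cycle comes from the text above).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy model of reachability analysis; only the idea, not the JVM's implementation. */
public class ReachabilityDemo {

    /** Hypothetical heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the given roots (the "marked" set). */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (marked.add(n)) {           // first visit: mark it and scan its fields
                worklist.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        Node d = new Node("D"), e = new Node("E"), f = new Node("F");
        a.refs.add(b); b.refs.add(c);                  // root -> A -> B -> C
        d.refs.add(e); e.refs.add(f); f.refs.add(d);   // cycle D -> E -> F -> D, no root

        Set<Node> live = mark(List.of(a));
        for (Node n : List.of(a, b, c, d, e, f)) {
            System.out.println(n.name + (live.contains(n) ? " live" : " garbage"));
        }
    }
}
```

Because marking starts only from roots, the cycle is never visited, so D, E, and F are reported as garbage even though each still has an incoming reference.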

### 1.2 Reachability Analysis (Mark Phase)

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/8b137593-5de9-4c00-a828-63c8f50f72a9.png align="center")

### 1.3 The Three Fundamental GC Operations

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/82627081-809c-4a4c-9dba-445301d1ad94.png align="center")

* * *

## 2\. GC Algorithm Strategies

### 2.1 Mark-Sweep

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/9f68d71a-b830-4963-b853-9179aabb0c91.png align="center")

### 2.2 Mark-Sweep-Compact

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/9bed9c1a-0cd7-4512-a307-e82fc97c53f2.png align="center")

### 2.3 Copying Collector

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/8fe77e0e-9d46-4bb2-8ea4-2f215a526ce8.png align="center")

### 2.4 Algorithm Comparison

| Algorithm | Fragmentation | Speed | Memory Overhead | Used In |
| --- | --- | --- | --- | --- |
| **Mark-Sweep** | Yes | Fast sweep | None | CMS (old gen; removed in JDK 14) |
| **Mark-Compact** | No | Slower (moving) | None | Serial, Parallel (old gen) |
| **Copying** | No | Fastest for low survival | Extra to-space (2x in a classic semispace) | Serial, Parallel, G1 (young gen) |
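
The fragmentation column can be illustrated with a toy heap of fixed-size slots. This is a sketch under simplifying assumptions (every object occupies one slot, and the `sweep`, `compact`, and `largestFreeRun` names are invented for the example); real collectors work on variable-sized objects and must also update references when moving them.

```java
import java.util.Arrays;

/** Toy heap of fixed-size slots; 0 means free, any other value is an object id. */
public class FragmentationDemo {

    /** Sweep: free dead slots in place. Survivors do not move, so holes remain. */
    static int[] sweep(int[] heap, boolean[] live) {
        int[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (!live[i]) out[i] = 0;
        }
        return out;
    }

    /** Compact: slide survivors to the left so all free space is contiguous. */
    static int[] compact(int[] heap) {
        int[] out = new int[heap.length];
        int next = 0;
        for (int slot : heap) {
            if (slot != 0) out[next++] = slot;
        }
        return out;
    }

    /** Longest run of free slots, i.e. the largest allocation that still fits. */
    static int largestFreeRun(int[] heap) {
        int best = 0, run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        boolean[] live = {true, false, true, false, true, false, true, false};

        int[] swept = sweep(heap, live);
        int[] compacted = compact(swept);
        System.out.println(Arrays.toString(swept) + " largest free run: " + largestFreeRun(swept));
        System.out.println(Arrays.toString(compacted) + " largest free run: " + largestFreeRun(compacted));
    }
}
```

After the sweep, half the heap is free but the largest contiguous run is a single slot; after compaction the same free space forms one run of four slots.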

* * *

## 3\. GC Types: Minor, Major, Full

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/a49dd8c6-b1c9-4e08-a0f6-231f4ece6b82.png align="center")

### 3.1 Minor GC Walkthrough

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/664db3b6-a487-4992-acda-8c71cf3c8ccb.png align="center")

### 3.2 Card Table and Remembered Sets

**Problem**: During Minor GC, we only scan Young Gen. But Old Gen objects might reference Young Gen objects. How do we find those references without scanning the entire Old Gen?

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/25fe8fa0-80c4-4210-8931-faf615fce703.png align="center")

> \[!TIP\] **Write Barrier**: A small piece of code the JVM executes at every reference store, in both the interpreter and JIT-compiled code. When you write `oldObj.field = youngObj`, the barrier marks the card covering the updated field as dirty. This is the cost of generational GC: every reference write carries a small overhead.
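
A card table can be modelled as one byte per fixed-size chunk of the old generation. The sketch below assumes 512-byte cards (HotSpot's default card size) and simulates addresses as plain integers; the class and method names are hypothetical, and the encoding (1 = dirty) is a simplification rather than HotSpot's actual byte values.

```java
/** Toy card table: one byte per 512-byte card of a simulated old generation. */
public class CardTableDemo {
    static final int CARD_SHIFT = 9;   // 2^9 = 512 bytes per card (assumed default)
    private final byte[] cards;

    CardTableDemo(int oldGenBytes) {
        // Round up so a partially filled last card still gets an entry.
        cards = new byte[(oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    /** What a post-write barrier does conceptually after `oldObj.field = youngObj`. */
    void writeBarrier(int fieldAddress) {
        cards[fieldAddress >>> CARD_SHIFT] = 1;   // dirty the card containing the field
    }

    /** A minor GC scans only the dirty cards, then cleans them. */
    int scanAndClearDirtyCards() {
        int scanned = 0;
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] != 0) {
                scanned++;
                cards[i] = 0;
            }
        }
        return scanned;
    }

    int cardCount() { return cards.length; }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(1 << 20);   // 1 MB old gen -> 2048 cards
        table.writeBarrier(100);      // card 0
        table.writeBarrier(400);      // card 0 again
        table.writeBarrier(70_000);   // card 136
        System.out.println("cards: " + table.cardCount()
                + ", dirty cards scanned: " + table.scanAndClearDirtyCards());
    }
}
```

Three stores dirty only two cards, so the simulated minor GC inspects 2 of 2048 cards instead of the whole old generation.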

* * *

## 4\. JVM Garbage Collectors

### 4.1 Collector Evolution Timeline

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/861a855d-a020-44ca-bd7a-35357ea35513.png align="center")

### 4.2 Serial GC (`-XX:+UseSerialGC`)

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/40f104c8-455d-4dc8-b4d0-a977e6cd5952.png align="center")

**Use case**: Client apps, small heaps under 100MB, single-CPU machines.

### 4.3 Parallel GC (`-XX:+UseParallelGC`)

Default in Java 8. Uses **multiple GC threads** for higher throughput.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/7822c1a7-87b1-4e7b-85d9-57ce6862ff22.png align="center")

| Flag | Purpose |
| --- | --- |
| `-XX:ParallelGCThreads=N` | Number of GC threads |
| `-XX:MaxGCPauseMillis=200` | Target max pause (a goal, not a guarantee) |
| `-XX:GCTimeRatio=99` | Throughput goal: at most 1/(1+99) = 1% of time in GC |
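
The `GCTimeRatio` arithmetic is easy to get backwards, so here is a small sketch of the relationship: a ratio of `N` asks for a GC share of `1 / (1 + N)` of total time. The class and method names are made up for the example.

```java
/** Converts a -XX:GCTimeRatio value into the targeted share of time spent in GC. */
public class GcTimeRatioDemo {

    /** For GCTimeRatio = n, the goal is a GC time fraction of 1 / (1 + n). */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {99, 19, 9}) {
            System.out.printf("GCTimeRatio=%d -> %.1f%% of time in GC%n",
                    n, 100 * gcTimeFraction(n));
        }
    }
}
```

So `99` targets 1%, `19` targets 5%, and `9` targets 10% of time in GC.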

### 4.4 G1 GC (`-XX:+UseG1GC`)

**Default since Java 9**. The most important collector to understand for interviews.

#### G1 Region-Based Heap

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/8ae1b2f1-2fa2-4d65-9c1c-d7b52c68ce34.png align="center")

> \[!NOTE\] G1 divides the heap into **equal-sized regions** (1MB–32MB, auto-calculated). Any region can serve any role. This flexibility allows G1 to collect the **most garbage-filled regions first** — hence "Garbage First."
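
To make "auto-calculated" concrete, the sketch below approximates the ergonomic choice: aim for roughly 2048 regions, round to a power of two, and clamp to the 1MB to 32MB range. The target of 2048 and the round-down step are assumptions about the heuristic; the exact rule and the heap size it is based on differ between JDK versions.

```java
/** Approximation of how G1 might pick a region size; not the exact HotSpot rule. */
public class G1RegionSizeDemo {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGION_COUNT = 2048;   // assumed ergonomic target

    static long regionSize(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION);
        long pow2 = Long.highestOneBit(raw);        // round down to a power of two
        return Math.min(pow2, MAX_REGION);
    }

    static long regionCount(long heapBytes) {
        return heapBytes / regionSize(heapBytes);
    }

    public static void main(String[] args) {
        long[] heapsInMb = {512, 4096, 6144, 131072};
        for (long heapMb : heapsInMb) {
            long heap = heapMb * MB;
            System.out.printf("heap %d MB -> region %d MB, %d regions%n",
                    heapMb, regionSize(heap) / MB, regionCount(heap));
        }
    }
}
```

Under these assumptions a 4GB heap gets 2MB regions, while a 128GB heap hits the 32MB cap and ends up with more than 2048 regions.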

#### G1 Collection Phases

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/44fcc7a4-7062-4c95-b47f-da43957fca77.png align="center")

#### G1 Key Concepts

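**IHOP (Initiating Heap Occupancy Percent)**: When old-generation occupancy reaches this percentage of the total heap, concurrent marking starts. Default: 45% as the initial value, then adjusted adaptively at runtime (`-XX:+G1UseAdaptiveIHOP`, on by default since JDK 9).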

**Evacuation**: G1 does not sweep in place — it copies live objects from collected regions to free regions.

**Mixed GC**: Collects both Young AND selected Old regions in a single STW pause.

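**SATB (Snapshot-At-The-Beginning)**: G1 takes a logical snapshot of the object graph at the start of concurrent marking, and everything reachable in that snapshot is treated as live for the cycle. A pre-write barrier records the old value of any reference that is overwritten during marking, and objects allocated during marking are considered live implicitly.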

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/a623b1f5-d712-4990-a312-3630f11c6ac9.png align="center")
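
Why the barrier has to record the overwritten value can be shown with a toy simulation. Everything here is a simplified assumption: objects have a single reference field, "concurrent" marking is interleaved by hand, and the names (`SatbDemo`, `store`, `markStep`) are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

/** Toy SATB model: marking steps interleaved with mutator stores. */
public class SatbDemo {
    static final class Obj {
        final String name;
        Obj field;                         // the only reference field
        Obj(String name) { this.name = name; }
    }

    private final boolean barrierEnabled;
    private boolean markingActive;
    private final Deque<Obj> worklist = new ArrayDeque<>();
    private final Deque<Obj> satbQueue = new ArrayDeque<>();   // old values logged by the barrier
    final Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<>());

    SatbDemo(boolean barrierEnabled) { this.barrierEnabled = barrierEnabled; }

    void startMarking(Obj root) {
        markingActive = true;
        worklist.push(root);
    }

    /** One marking step: mark one object and queue the object it references. */
    void markStep() {
        Obj o = worklist.poll();
        if (o != null && marked.add(o) && o.field != null) {
            worklist.push(o.field);
        }
    }

    /** Reference store with a pre-write barrier: log the value about to be overwritten. */
    void store(Obj holder, Obj newValue) {
        if (barrierEnabled && markingActive && holder.field != null) {
            satbQueue.push(holder.field);
        }
        holder.field = newValue;
    }

    /** Finish marking, treating logged old values as additional objects to trace. */
    void finishMarking() {
        while (!worklist.isEmpty() || !satbQueue.isEmpty()) {
            while (!satbQueue.isEmpty()) worklist.push(satbQueue.pop());
            markStep();
        }
        markingActive = false;
    }

    /** Runs the scenario and reports whether B ended up marked. */
    static boolean run(boolean barrierEnabled) {
        Obj r = new Obj("R"), a = new Obj("A"), b = new Obj("B");
        r.field = a;
        a.field = b;                       // snapshot: R -> A -> B
        SatbDemo gc = new SatbDemo(barrierEnabled);
        gc.startMarking(r);
        gc.markStep();                     // R is marked and scanned; A is queued
        Obj local = a.field;               // mutator loads B into a local variable
        gc.store(a, null);                 // removes A -> B
        gc.store(r, local);                // hides B behind the already-scanned R
        gc.finishMarking();
        return gc.marked.contains(b);
    }

    public static void main(String[] args) {
        System.out.println("B marked with barrier:    " + run(true));
        System.out.println("B marked without barrier: " + run(false));
    }
}
```

With the barrier, the overwritten reference to B is logged and B is marked; without it, B stays unmarked although it is still reachable through R, which in a real collector would mean freeing a live object.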

#### G1 Tuning Flags

| Flag | Purpose | Default |
| --- | --- | --- |
| `-XX:MaxGCPauseMillis` | Target pause time | 200ms |
| `-XX:G1HeapRegionSize` | Region size | Auto (1-32MB) |
| `-XX:InitiatingHeapOccupancyPercent` | IHOP trigger | 45% (adaptive) |
| `-XX:G1MixedGCCountTarget` | Mixed GCs per cycle | 8 |
| `-XX:G1HeapWastePercent` | Stop mixed GCs once reclaimable space falls below this | 5% |
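
To check which values a running JVM actually uses, the flags can be read through the `HotSpotDiagnosticMXBean`. This is a sketch that assumes a HotSpot-based JVM; the `flagValue` helper and its `"n/a"` fallback are choices made for the example, and G1-specific flags may be reported even when another collector is active.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Prints the effective values of the G1 tuning flags from the table above. */
public class G1FlagInspector {

    /** Returns the current value of a HotSpot flag, or "n/a" if it cannot be read. */
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or a JVM that does not expose this bean.
            return "n/a";
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "MaxGCPauseMillis",
            "G1HeapRegionSize",
            "InitiatingHeapOccupancyPercent",
            "G1MixedGCCountTarget",
            "G1HeapWastePercent"
        };
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

Running it with and without an explicit `-XX:` setting is a quick way to confirm that a flag was picked up.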

### 4.5 ZGC (`-XX:+UseZGC`)

Ultra-low latency collector, production-ready since JDK 15. Pauses are typically **sub-millisecond** regardless of heap size.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/498b7b82-04c0-4f04-b525-fa3300e224e7.png align="center")

**ZGC Key Properties:**

*   Pauses are O(1) — do NOT scale with heap or live set size
    
*   Supports multi-terabyte heaps
    
*   Concurrent relocation (compaction while app runs)
    
*   Uses colored pointers and load barriers
    
*   Generational ZGC (Java 21+) adds generations for better throughput
    

### 4.6 Collector Comparison

| Collector | Pauses | Throughput | Heap Size | Best For |
| --- | --- | --- | --- | --- |
| **Serial** | Long, single-thread | Low | Small | Embedded, client |
| **Parallel** | Medium, multi-thread | Highest | Medium-Large | Batch, throughput |
| **G1** | Predictable target | Good | Large | General purpose |
| **ZGC** | Sub-ms | Good | Any (TB scale) | Latency-critical |
| **Shenandoah** | Low (typically a few ms or less) | Good | Any | Latency-critical (Red Hat) |

* * *

## 5\. Safepoints — How the JVM Pauses Threads

GC cannot pause threads at arbitrary points. Threads must reach a **safepoint** first.

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/bc3933ed-2295-4737-9196-61090a064519.png align="center")

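> \[!WARNING\] **Time-To-Safepoint (TTSP)** can be a hidden latency source. A counted loop compiled without a safepoint poll (e.g., `for(int i=0; i<1_000_000; i++)`) can delay the start of a GC pause. `-XX:+UseCountedLoopSafepoints` keeps safepoint polls in counted loops; since JDK 10, G1 enables it by default together with loop strip mining, which limits the polling overhead. With other collectors, check the flag's effective value before relying on it.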

* * *

## 6\. Reference Types and GC

Java provides four reference strengths that interact with GC:

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/8f0a0d76-a388-4407-b7a5-3b1ad2edb889.png align="center")

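```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Strong - default; the object stays alive while the reference is reachable
Object strong = new Object();

// Soft - cleared at the collector's discretion when memory is low
SoftReference<byte[]> cache = new SoftReference<>(new byte[1024 * 1024]);

// Weak - cleared at the next GC once no strong references remain, regardless of memory
WeakReference<Object> weak = new WeakReference<>(new Object());
// WeakHashMap holds its keys this way, so entries disappear when a key is collected

// Phantom - for post-mortem cleanup (a safer alternative to finalize())
Object obj = new Object();
ReferenceQueue<Object> referenceQueue = new ReferenceQueue<>();
PhantomReference<Object> phantom = new PhantomReference<>(obj, referenceQueue);
// phantom.get() always returns null; poll referenceQueue to learn when obj was collected
```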
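
For post-mortem cleanup in application code, `java.lang.ref.Cleaner` (Java 9+) wraps the phantom-reference machinery in a simpler API. The sketch below is one possible shape, not a prescribed pattern: the `Resource` and `State` classes and the `released` counter are hypothetical stand-ins for a real native or off-heap resource.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of Cleaner-based cleanup with an explicit close() as the primary path. */
public class CleanerDemo {
    static final Cleaner CLEANER = Cleaner.create();
    static final AtomicInteger released = new AtomicInteger();   // counts cleanup runs

    /** Hypothetical resource holder. */
    static final class Resource implements AutoCloseable {

        /** Cleanup action; it must not reference the Resource, or it would never be collected. */
        private static final class State implements Runnable {
            @Override
            public void run() {
                released.incrementAndGet();   // e.g. free native memory here
            }
        }

        private final Cleaner.Cleanable cleanable;

        Resource() {
            this.cleanable = CLEANER.register(this, new State());
        }

        @Override
        public void close() {
            cleanable.clean();   // runs the action at most once, even if called again
        }
    }

    public static void main(String[] args) {
        try (Resource r = new Resource()) {
            System.out.println("using resource");
        }
        System.out.println("released = " + released.get());
    }
}
```

If `close()` is never called, the cleaner thread runs the same action some time after the `Resource` becomes phantom reachable, so the GC path acts as a safety net rather than the main release mechanism.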

* * *

## 7\. GC Logging and Analysis

### 7.1 Enable GC Logging

```bash
# Java 9+ (Unified Logging)
java -Xlog:gc*:file=gc.log:time,uptime,level,tags -jar app.jar

# Key log tags
# gc           - basic GC events
# gc+heap      - heap before/after
# gc+phases    - GC phase timings
# gc+age       - tenuring age distribution
# gc+promotion - promotion details
```

### 7.2 Key Metrics to Monitor

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/7645813c-4da7-46e1-a1d6-ce6be78e1175.png align="center")
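
Some of these numbers are also available from inside the process through the standard `GarbageCollectorMXBean` API. The sketch below assumes that collection count, accumulated collection time, and overall GC overhead are among the metrics of interest; the class and helper names are invented for the example. For concurrent collectors, the reported time may include concurrent work and not only pauses, so treat the overhead figure as a rough indicator.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads per-collector counters and derives a rough GC overhead percentage. */
public class GcOverheadProbe {

    /** Percentage of elapsed time attributed to GC. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums accumulated collection time across collectors; -1 (undefined) counts as 0. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.2f%%%n", overheadPercent(totalGcMillis(), uptime));
    }
}
```

Sampling these counters periodically and looking at the deltas gives GC frequency and time per interval without parsing log files.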

* * *

## 8\. Common GC Problems and Solutions

### 8.1 Memory Leak Pattern

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/8978f4d3-93f1-4d18-a3dd-81538263f313.png align="center")

**Common leak sources:**

*   Static collections that grow unbounded
    
*   Listener/callback registrations never removed
    
*   ThreadLocal variables not cleared
    
*   ClassLoader leaks (hot redeployment)
    
*   Unclosed resources holding references
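
The first item in the list is the most common one in practice. The sketch below contrasts an unbounded static map with a size-bounded LRU alternative built on `LinkedHashMap`; the `lruCache` helper and the limit of 100 entries are arbitrary choices for the example, and the map is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Bounding a static cache so it cannot grow without limit. */
public class BoundedCacheDemo {

    /** LRU map that evicts its eldest entry once it grows past maxEntries. */
    static <K, V> Map<K, V> lruCache(int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {   // true = access order
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Leak-prone pattern: a static map that is only ever added to.
    // static final Map<String, byte[]> CACHE = new HashMap<>();

    // Bounded alternative (100 entries is an arbitrary limit for this sketch).
    static final Map<String, byte[]> CACHE = lruCache(100);

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            CACHE.put("key-" + i, new byte[1024]);
        }
        System.out.println("entries retained: " + CACHE.size());
    }
}
```

The static field still acts as a GC root, but the number of objects it can keep alive is now capped, so the old-generation baseline stops climbing.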
    

### 8.2 Premature Promotion

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/d725d42b-ba82-403e-9485-c1950d1353e5.png align="center")

### 8.3 GC Thrashing

When the JVM spends more time in GC than running the application:

```plaintext
// The JVM throws this when both of the following hold:
// - More than 98% of total time is spent in GC
// - Less than 2% of the heap is recovered
java.lang.OutOfMemoryError: GC overhead limit exceeded
```

* * *

## 9\. FAQs: Garbage Collection

**Q1: How does the JVM determine which objects are garbage?**

Reachability analysis from GC Roots (stack locals, static fields, active threads, JNI refs). Any object not reachable from a root is garbage. Java does NOT use reference counting, so circular references are handled correctly.

**Q2: What is the difference between Minor GC, Major GC, and Full GC?**

Minor GC collects Young Gen only (Eden + Survivors). Major GC collects Old Gen. Full GC collects the entire heap plus Metaspace. Minor GCs are fast (most objects die young). Full GC is the most expensive and should be minimized.

**Q3: Explain G1 GC in depth**

G1 divides the heap into equal-sized regions (1-32MB). Any region can be Eden, Survivor, Old, or Humongous. G1 runs Young GCs (evacuating Eden/Survivor regions) and triggers Concurrent Marking when old-generation occupancy reaches the IHOP threshold. After marking, it runs Mixed GCs that collect Young AND the most garbage-filled Old regions. G1 targets a configurable max pause time (default 200ms) by limiting how many regions it collects per pause.

**Q4: What is a safepoint and why does it matter?**

A safepoint is a point in executing code where a thread can be safely paused for GC. The JVM cannot stop threads at arbitrary points because object references might be in an inconsistent state. Compiled code polls a safepoint flag at method returns and loop back-edges, while the interpreter can check between bytecodes. Time-to-safepoint can be a hidden latency source if long-running loops lack safepoint polls.

**Q5: How would you diagnose a memory leak in production?**

1.  Enable GC logging (`-Xlog:gc*`) and check whether the Old Gen baseline keeps rising after Full GCs.
    
2.  Take heap dumps (`jmap -dump:live,format=b,file=heap.hprof <pid>`). The `live` option triggers a full GC first, so expect a pause.
    
3.  Analyze with Eclipse MAT or VisualVM — look at dominator tree and histogram.
    
4.  Check retained size by class to find what is holding memory.
    
5.  Look for GC root paths to leaked objects — the path shows what is preventing collection.
    
6.  Common culprits: static maps, unclosed resources, ThreadLocal, listener leaks.
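
When attaching `jmap` to the process is not an option, a heap dump can also be triggered from inside the application. This is a sketch assuming a HotSpot-based JVM; the `HeapDumper` class, the temp-directory location, and the file naming are hypothetical choices, and a dump of a large heap can take a long time and a lot of disk space.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Triggers an .hprof heap dump programmatically (HotSpot only). */
public class HeapDumper {

    /** Writes a heap dump into dir; live=true dumps only reachable objects. */
    static Path dump(Path dir, boolean live) throws IOException {
        Path file = dir.resolve("heap-" + System.currentTimeMillis() + ".hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), live);   // fails if the file already exists
        return file;
    }

    /** Convenience wrapper that dumps into a fresh temporary directory. */
    static Path dumpToTempDir(boolean live) {
        try {
            return dump(Files.createTempDirectory("heapdump"), live);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = dumpToTempDir(true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file can be opened in Eclipse MAT or VisualVM exactly like a dump produced by `jmap`.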
    

**Q6: When would you choose ZGC over G1?**

Choose ZGC when sub-millisecond pause times are critical regardless of heap size (financial trading, real-time systems). G1 is the better fit for general-purpose workloads where pauses around its 200ms default target are acceptable. ZGC has slightly lower throughput than G1 due to load barrier overhead. Generational ZGC (Java 21+) narrows that throughput gap.

**Q7: What is the write barrier in G1 and why is it needed?**

G1 uses two write barriers: a pre-write barrier for SATB marking (captures the old reference before it is overwritten) and a post-write barrier for remembered sets (tracks cross-region references). Without the post-write barrier, G1 would need to scan the entire heap to find inter-region references during a partial collection; without the pre-write barrier, concurrent marking could miss objects that were live in the snapshot.

* * *

## 10\. GC Selection Decision Tree

![](https://cdn.hashnode.com/uploads/covers/637f189ed7d9bcd845996b4b/b432231e-9a93-48df-bfb0-7cdc366d3be8.png align="center")
