An entry in Dr. Dobb's Journal, "G1: Java's Garbage First Garbage Collector," discusses Garbage First (G1), a much-anticipated feature newly introduced with the release of Java SE 6 Update 14. G1 works to reclaim areas of memory within an application that will never be accessed again.
According to the article, G1 is a low-pause, low-latency, sometimes soft real-time collector that allows users to set max pause time goals and collection intervals through suggestions on the Java VM command line. G1 will attempt to meet goals and, as a result, introduce as little latency as possible into an application. Therefore, the VM runs more predictably as it attempts to meet the pause time goals users provide.
Author Eric J. Bruno describes G1 as a server-style garbage collector, targeted for multi-processors with large memories, that meets a soft real-time goal with high probability. G1 achieves its goals by achieving high throughput, which distinguishes it from other real-time collectors.
Bruno writes that the G1 collector divides its work into multiple phases that operate on a heap broken down into equally sized regions. The absence of generations in the heap provides flexibility in how garbage collection is performed, which is adjusted on-the-fly according to the amount of processor time available to the collector.
The G1 collector works in stages that consist of remembered set (RS) maintenance, concurrent marking, and evacuation pauses. Bruno examines each in turn.
Remembered Set Maintenance
With RS maintenance, Bruno writes, each region maintains an associated subset of cards that have recently been written to. This is the Remembered Set (RS). These cards are placed in a region's RS via a write barrier, which is an efficient block of code that all mutator threads must execute when modifying an object reference. For a particular region, (say, region A) only cards that contain pointers from other regions to an object in region A are recorded in region A's RS, ignoring a region's internal references, as well as null references.
Bruno writes that each region's remembered set is implemented as a group of collections, with the dirty cards distributed amongst them according to the number of references contained within. Three levels of coarseness are maintained: sparse, fine, and coarse. It's broken up this way so that parallel GC threads can operate on one RS without contention, and can target the regions that will yield the most garbage. However, it's best to think of the RS as one logical set of dirty cards.
Concurrent Marking
At the concurrent marking stage G1 identifies live data objects per region and maintains the pointer to the next free byte, called "top." Small stop-the-world pauses occur to ensure the correct heap state, Bruno notes. A marking bitmap is maintained to create a summary view of the live objects within the heap. Each bit in the bitmap corresponds to one word within the heap (an area large enough to contain an object pointer). A bit in the bitmap is set when the object it represents is determined to be a live object. In reality there are two bitmaps: one for the current collection, and a second for the previously completed collection. This is one way that changes to the heap are tracked over time, Bruno explains.
The author next details the steps in each of the three stages, the marking stage, the re-marking stage and the cleanup stage.
Evacuation Pauses
Finally, turning to the evacuation and collection stage, Bruno calls this the heart of G1's operation, the stage which involves reclaiming dead objects and shaping the heap for efficient object allocation. The collection set of regions from the Concurrent Marking process forms a subset of the heap that is used during this process. When evacuation begins, all mutator threads are paused, and live objects are moved from their respective regions and compacted into other regions. Although other garbage collectors might perform compaction concurrently with mutator threads, the writer suggests, it's far more efficient to pause them. Since this operation is only performed on a portion of the heap -- it compacts only the collection set of regions -- it's a relatively quick, low-pause, operation. Once this phase completes, the GC cycle is complete.
Bruno adds details on how parallelization involving evacuation of multiple GC threads limits total pause time.
Finally, the article provides code for enabling G1 from Java SE6 Update 14, as well as code for tuning the solution.
More Information
Under the Hood with Garbage First
Java News Bites: Garbage-First Garbage Collector
[...read more...]