Please note that this post may change in the future to fix mistakes and add new material.

We all know the importance of Java GC, but the sheer volume of writing on the topic can be hard to digest, so I've written this post to help sort out some of the confusion (especially in terminology).

Note that you will need a basic understanding of how GC works.  If you don't have one yet, the introductory links in the References section (for example, Java Garbage Collection Basics) are a good place to start.

The Generational Heap

| Generation | What’s on it |
| --- | --- |
| Young gen (aka. New gen) | The pool from which memory is initially allocated for most objects. |
| Old gen (aka. Tenured space) | The pool containing objects that have existed for some time in the survivor space. |
| Perm gen | The pool containing all the reflective data of the virtual machine itself, such as class and method objects. |
| Code cache | Memory that is used for compilation and storage of native code. |
HotSpot Memory Layout

heap = young gen (aka. new gen) + old gen (aka. tenured space) 
young gen = eden space + survivor space 1 + survivor space 2
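To see how this layout looks on a particular JVM, the standard `java.lang.management` API can list the memory pools the VM reports. The following is a minimal sketch; the class name `MemoryPoolLister` is made up for illustration, and the pool names printed depend on the JVM version and the collector in use (for example, newer JVMs report a Metaspace pool rather than Perm gen).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Lists the memory pools the running JVM reports (hypothetical helper class). */
class MemoryPoolLister {

    /** Returns the names of all pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    /** Sums the bytes currently used across all heap pools. */
    static long usedHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
                if (usage != null) {
                    total += usage.getUsed();
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
        System.out.println("Used heap (bytes): " + usedHeapBytes());
    }
}
```

The heap pools should correspond to eden, the survivor spaces and the old gen; the non-heap pools should include the code cache.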

One important point to bear in mind:
Almost all objects on the heap eventually die (apart from application-scoped objects), whether young or old.  The used heap size should therefore depend only on the rate of object creation and on object lifetimes, not on how long the application has been running (unless there is a memory leak).
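A rough way to put numbers on this is Little's law from queueing theory: in a steady state, the amount of live data is approximately the allocation rate multiplied by the mean object lifetime. The sketch below is only a back-of-envelope aid; the class and method names are invented for illustration, the figures in `main` are made up, and real heaps also hold garbage that has not been collected yet.

```java
/** Back-of-envelope estimate of live heap size (hypothetical helper class). */
class LiveHeapEstimate {

    /**
     * Little's law: live bytes ~= allocation rate * mean lifetime.
     * The mean lifetime is assumed to be weighted by object size.
     */
    static double steadyStateLiveBytes(double allocBytesPerSec, double meanLifetimeSec) {
        if (allocBytesPerSec < 0 || meanLifetimeSec < 0) {
            throw new IllegalArgumentException("rate and lifetime must be non-negative");
        }
        return allocBytesPerSec * meanLifetimeSec;
    }

    public static void main(String[] args) {
        double rate = 100.0 * 1024 * 1024; // illustrative: 100 MB allocated per second
        double lifetime = 0.5;             // illustrative: objects live 0.5 s on average
        double liveMb = steadyStateLiveBytes(rate, lifetime) / (1024 * 1024);
        System.out.println("Estimated live data: " + liveMb + " MB");
    }
}
```

Note that uptime does not appear anywhere in the estimate, which is the point made above: if used heap after collection keeps growing with uptime, a leak is the likely explanation.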

The Collectors

The articles listed in the References section provide a good overview of the types of collectors as well as their combinations in HotSpot.

This table summarises all collectors (apart from the G1 collector):

| Type | Generation | Compaction Collector | Non Compaction Collector |
| --- | --- | --- | --- |
| Serial (Single thread) | Young gen | Serial collector (aka. Copy collector) (-XX:+UseSerialGC) | None |
| Serial (Single thread) | Old gen | Mark Sweep Compact (aka. Serial old collector, MSC) (-XX:+UseSerialGC); note this is the default old GC | None |
| Parallel (Multi thread) | Young gen | Parallel collector (aka. Parallel scavenge, PS) (-XX:+UseParallelGC); only works with serial/parallel old collector | None |
| Parallel (Multi thread) | Young gen | Parallel new collector (-XX:+UseParNewGC); mainly works with CMS, but also works with MSC (because of full GC) | None |
| Parallel (Multi thread) | Old gen | Parallel Scavenge Mark Sweep Compact (aka. Parallel MSC) (-XX:+UseParallelOldGC); note this will force -XX:+UseParallelGC | None |
| Concurrent | Old gen | None | Concurrent Mark Sweep (-XX:+UseConcMarkSweepGC); note this only works with -XX:+UseParNewGC or -XX:-UseParNewGC (i.e., use serial collector) |
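To check which combination from the table a JVM is actually running, you can ask it through `GarbageCollectorMXBean`. This is a minimal sketch with an invented class name, `ActiveCollectors`. On the HotSpot builds I am aware of, the reported names are typically "Copy" and "MarkSweepCompact" for the serial pair, "PS Scavenge" and "PS MarkSweep" for the parallel pair, and "ParNew" and "ConcurrentMarkSweep" for CMS, but treat those strings as an assumption rather than a guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Reports the collectors in use and the pools each one manages (hypothetical helper class). */
class ActiveCollectors {

    /** Maps each collector name to the names of the memory pools it collects. */
    static Map<String, List<String>> collectorsAndPools() {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.put(gc.getName(), Arrays.asList(gc.getMemoryPoolNames()));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, List<String>> entry : collectorsAndPools().entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

The young gen collector usually lists only the eden and survivor pools, while the old gen collector lists the old gen as well, since it is the one used for whole-heap collections.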

Let me just highlight some important points:
  1. A minor collection works on the young gen only, while a major collection (used here interchangeably with full GC) works on the entire heap.
  2. A major collection is triggered when the old gen is full.  It collects the old gen first, then the young gen, promoting qualified objects to the old gen.
  3. Stop-the-world events are not bad in themselves.  All collector pauses are stop-the-world (whether it's a young gen collector or an old gen one), in order to calculate the GC roots and possibly compact the heap or copy objects (to survivor spaces).  What is bad is a major collection that stops the world for too long, or frequent minor collections that add up to a long total pause time.
  4. A full GC is slow because it involves compacting the heap (i.e., relocating live objects in the old gen so they are close together).
  5. The CMS (concurrent-mark-sweep) collector tries to avoid full GC (hence better response time) at the cost of throughput (because the collector threads run concurrently with the application threads and compete with them for CPU).  When a full GC is inevitable (the old gen is full or overly fragmented, or the collector cannot keep up with the object creation rate), it falls back to a stop-the-world serial collection (mark-sweep-compact) to compact the heap (just once).
  6. A rule of thumb: parallel GC for throughput, concurrent GC for response time.
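To judge whether collections are adding up to too much time, as point 3 warns, the same management API exposes a count and an accumulated time per collector. The sketch below uses an invented class name, `GcOverhead`. One caveat: `getCollectionTime()` is documented as an approximate accumulated elapsed time, so for a concurrent collector it is not the same thing as pure stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Rough GC overhead figures from the management API (hypothetical helper class). */
class GcOverhead {

    /** Percentage of elapsed time spent in GC; returns 0 when no time has elapsed. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums the approximate accumulated collection time over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-25s collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.2f%% of %d ms uptime%n",
                overheadPercent(totalGcMillis(), uptime), uptime);
    }
}
```

A high collection count on the young gen collector with a small total time is normally fine; a growing count on the old gen collector is the figure to watch.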

Ways to Tune

There are two main ways to tune a JVM, as documented in Java SE 6 HotSpot GC tuning (reference 3):
1. Explicitly setting the heap and off-heap parameters

2. Using ergonomics (only works for parallel collectors)

Also see reference 10 for a list of all JVM options in Java 6.
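Whichever approach you take, it helps to confirm which options the JVM was actually started with. The sketch below, using an invented class name `JvmTuningFlags`, filters the JVM's input arguments down to the sizing options (`-Xms`, `-Xmx`, `-Xmn`) and the `-XX:` options, which cover both explicit sizes and ergonomics goals such as `-XX:MaxGCPauseMillis` or `-XX:GCTimeRatio`. The prefix list is my own choice for illustration, not an exhaustive one.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Shows the memory and GC related options of the running JVM (hypothetical helper class). */
class JvmTuningFlags {

    // Assumed prefixes of interest; extend as needed.
    private static final String[] PREFIXES = {"-Xms", "-Xmx", "-Xmn", "-XX:"};

    /** Keeps only the arguments that start with one of the prefixes above. */
    static List<String> memoryAndGcArgs(List<String> jvmArgs) {
        List<String> selected = new ArrayList<>();
        for (String arg : jvmArgs) {
            for (String prefix : PREFIXES) {
                if (arg.startsWith(prefix)) {
                    selected.add(arg);
                    break;
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Tuning options: " + memoryAndGcArgs(jvmArgs));
        // The maximum heap the JVM will attempt to use, whether set explicitly or by ergonomics.
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

If the list comes back empty, every setting was left to the JVM's defaults and ergonomics.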

How to Monitor GC

VisualVM and JConsole are the most commonly used tools for visualising GC.  Alternatively, the following command prints GC information at a fixed interval:

jstat -gcutil <pid> <period in ms>
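If attaching an external tool is not an option, similar utilisation figures can be sampled from inside the application. The following is a minimal sketch, not a replacement for jstat: the class name `HeapUtilisation` is invented, the percentages are computed against each pool's committed size, and the number of samples is fixed at five for brevity.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Prints per-pool heap utilisation at a fixed interval (hypothetical helper class). */
class HeapUtilisation {

    /** Used space as a percentage of capacity; returns 0 for an empty or unknown capacity. */
    static double percentUsed(long used, long capacity) {
        if (capacity <= 0) {
            return 0.0;
        }
        return 100.0 * used / capacity;
    }

    /** Takes one sample: heap pool name -> percentage of committed memory in use. */
    static Map<String, Double> snapshot() {
        Map<String, Double> sample = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (usage != null) {
                sample.put(pool.getName(), percentUsed(usage.getUsed(), usage.getCommitted()));
            }
        }
        return sample;
    }

    public static void main(String[] args) throws InterruptedException {
        // Sampling period in milliseconds; defaults to one second.
        long intervalMs = args.length > 0 ? Long.parseLong(args[0]) : 1000L;
        for (int i = 0; i < 5; i++) {
            System.out.println(snapshot());
            Thread.sleep(intervalMs);
        }
    }
}
```

Eden should climb steadily and drop back at each minor collection, while the old gen figure should only fall after a major collection.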


References

  1. Generational heap: http://stackoverflow.com/a/1262474/842860
  2. Java Garbage Collection Basics: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
  3. Java SE 6 HotSpot GC tuning: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
  4. http://www.infoq.com/articles/Java_Garbage_Collection_Distilled
  5. http://javabook.compuware.com/content/memory/how-garbage-collection-works.aspx
  6. http://www.fasterj.com/articles/oraclecollectors1.shtml
  7. Minor GC, with a list of all possible collector options: http://blog.griddynamics.com/2011/06/understanding-gc-pauses-in-jvm-hotspots.html
  8. CMS: http://blog.griddynamics.com/2011/06/understanding-gc-pauses-in-jvm-hotspots_02.html
  9. TLAB: http://stackoverflow.com/a/25515423/842860
  10. List of JVM options for 1.6: http://stas-blogspot.blogspot.co.uk/2011/07/most-complete-list-of-xx-options-for.html
