3

I have just noticed that some action (which I am still trying to identify) caused the virtual machine to hang several times; it looks like the garbage collector stopped the world. A minor GC normally takes about 0.05 seconds, but the logs show that it suddenly rose to as much as 15 seconds per minor GC. The virtual machine was unstable for about 10 minutes.

Is there any way to find the cause in the source code (other than asking users what they were doing at that moment)?

The virtual machine runs on a dedicated Linux machine, and I only have remote access to it. The process uses 6 GB of memory in total (when stable), so creating a memory snapshot takes a long time.

  • If you haven't seen this already, you might be interested in this thread. It gives an example of debugging the VM: http://stackoverflow.com/a/1849365/751245 – Klazen108 Nov 19 '13 at 15:23
  • Have you tried using Concurrent Mark-Sweep GC with a parallel new generation? – Rogue Nov 19 '13 at 15:40
  • I am using these arguments: -Xmx16G -XX:PermSize=2G -XX:+UseConcMarkSweepGC -XX:+UseTLAB -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:./gc.log –  Nov 19 '13 at 16:34

2 Answers

1

15 seconds sounds excessive under any circumstances. You should check that you aren't simply swapping (especially during the GC itself), which could explain the slow GC times.

Run vmstat 5 or similar and check the si and so fields. If they aren't zero, you are swapping.
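
If you would rather capture the vmstat output to a file and check it afterwards, the si/so check can be automated. The following is only a minimal sketch, assuming the usual Linux vmstat layout with a header row that names the si and so columns; the class name SwapCheck and the file argument are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SwapCheck {

    /** Returns true if any data row reports a non-zero swap-in (si) or swap-out (so) value. */
    public static boolean isSwapping(List<String> vmstatLines) {
        int si = -1;
        int so = -1;
        for (String line : vmstatLines) {
            String[] cols = line.trim().split("\\s+");
            List<String> names = Arrays.asList(cols);
            // The header row names the columns; vmstat repeats it periodically.
            if (names.contains("si") && names.contains("so")) {
                si = names.indexOf("si");
                so = names.indexOf("so");
                continue;
            }
            // Skip anything before the first header and rows that are too short.
            if (si < 0 || cols.length <= Math.max(si, so)) {
                continue;
            }
            try {
                if (Long.parseLong(cols[si]) != 0 || Long.parseLong(cols[so]) != 0) {
                    return true;
                }
            } catch (NumberFormatException e) {
                // Not a numeric data row (for example a banner line); ignore it.
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SwapCheck <file-with-vmstat-output>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(isSwapping(lines) ? "swapping detected" : "no swapping seen");
    }
}
```

Keep in mind that the first data row vmstat prints is an average since boot, so a non-zero value there does not necessarily mean the machine is swapping right now.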

Dolda2000
1

Try collecting thread dumps before and after that YoungGen spike. Also, you could try collecting class histograms at the same time. If it is a FullGC that is spiking, you can use VM flags that will collect class histograms before/after each FullGC.
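
Since the spike is hard to catch by hand, one option is to let the application dump its own threads whenever a collection runs long. This is only a sketch, assuming a HotSpot JVM (the com.sun.management notification API is HotSpot-specific) and Java 8 or later for the lambda; the class name SlowGcWatcher and the threshold are made up for illustration. The notification is delivered after the collection has finished, so this only covers the "after" dump.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

public class SlowGcWatcher {

    /**
     * Registers a listener on every collector that supports notifications.
     * Returns the number of collectors the listener was attached to.
     */
    public static int install(long thresholdMillis) {
        NotificationListener listener = (notification, handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(notification.getType())) {
                return;
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) notification.getUserData());
            long duration = info.getGcInfo().getDuration(); // milliseconds
            if (duration >= thresholdMillis) {
                System.err.printf("Slow GC: %s (%s) took %d ms%n",
                        info.getGcName(), info.getGcCause(), duration);
                // ThreadInfo.toString() prints an abbreviated stack trace per thread.
                for (ThreadInfo t : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
                    System.err.print(t);
                }
            }
        };
        int attached = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
                attached++;
            }
        }
        return attached;
    }
}
```

Calling SlowGcWatcher.install(1000) once at startup would log a dump for any collection reported as taking a second or more. For concurrent collectors the reported duration may include concurrent phases rather than pure pause time, so treat it as a hint and compare against the GC log.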

jstat will give you more insight into heap region occupancy than the GC logs do; try collecting that as well.
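
If running jstat on the remote machine is awkward, similar per-region occupancy figures can be read from inside the process through the standard java.lang.management API. This is not jstat itself, just a rough in-process sketch; the class name HeapPools is made up, and the pool names you get back depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPools {

    /** Returns one formatted line per memory pool with its current usage. */
    public static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + "K";
            lines.add(String.format("%-30s used=%dK committed=%dK max=%s",
                    pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints a single snapshot; call snapshot() periodically to build a time series.
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Logging a snapshot every few seconds alongside the GC log makes it easier to see which region fills up just before the slow collections.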

A profiler could also be attached in sampling mode; Mission Control, for example, has low overhead and should give you some insight into what is happening in the VM during that GC spike.

Collecting iostat, netstat, vmstat, etc. will help you rule out any interference from outside of the VM.

Aleš