
I am running some CPU-intensive Clojure code from within IntelliJ IDEA (I don't think that's important; it seems to just spawn a process). According to both htop and top, it is using all 4 cores (well, 2 + hyperthreading) on my laptop. This is despite me not having any explicit parallelism in the code.

A little more detail: top shows a single process with ~380% CPU use, while htop shows a "parent" process and then 4 "children", each with 1/4 the time and ~100% CPU.

Is this normal? Or does it mean I have got something very wrong somewhere? The code involves many lazy sequences, but at its core modifies a mutable data structure (a mutable - not a Clojure data structure - hash that accumulates results). I am not using any explicit parallelism.

A significant amount of time is likely (I haven't profiled) spent in JCA/JCE (crypto lib) - I am using multiple AES ciphers in CTR mode, each as a stream of secure random bytes (code here), implemented as lazy seqs. Perhaps that is parallelized?

More random ideas: Could this be related to IO? I'm running on an encrypted SSD and this program is processing data from disk, so does a lot of reading. But htop shows system time as red, and these are green.

Sorry for such a vague question. I can post more info if required. This is Clojure 1.4 on 64bit Linux (JDK 1.7.0_05). The code being executed is here but it's pretty messy (more apologies) and spread across various files (most CPU time is spent in nearest-in-dump in the code there). Note - please don't waste time trying to run code to reproduce, as it expects a pre-existing data-dump to be on disk (which isn't in git).

Debugger: Running in the debugger (thanks, A-M) shows four threads (if I understand the debugger correctly), but only one is executing the program. They are labelled finalizer, main (the program), reference handler, and signal dispatcher. Finalizer + ref handler are in wait state; signal dispatcher has no frames available. I tentatively think this means the parallelism is at a lower level, perhaps in the crypto implementation?
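
The threads the debugger reports are Java-level threads. As a minimal sketch, assuming a stock HotSpot JVM, the same list can be printed from inside the program with `Thread.getAllStackTraces()`. The class name `ThreadLister` and its method are made up for illustration. HotSpot's GC worker threads are internal VM threads, not `java.lang.Thread` objects, so they should not appear in this list even while they are busy. OS-level tools such as htop, or a `jstack` thread dump, do show them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original program.
public class ThreadLister {
    // Returns sorted "name [STATE]" descriptions of the live Java-level threads.
    static List<String> liveThreads() {
        List<String> out = new ArrayList<String>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            out.add(t.getName() + " [" + t.getState() + "]"
                    + (t.isDaemon() ? " daemon" : ""));
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        for (String line : liveThreads()) {
            System.out.println(line);
        }
    }
}
```

If this list stays short while htop shows several busy threads under the same process, the extra CPU use is probably coming from inside the JVM itself and not from the program's own code.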

Aha: I think it's parallel GC (HotSpot has a collector that runs on multiple threads). At the start, CPU use jumps way up when the actual process pauses (it prints out a regular tick). And since it's churning through lots of data, it's generating a lot of short-lived objects. This is confirmed by running with -XX:+UseSerialGC, which reduces CPU use to 100%.

andrew cooke
  • Try running it in a debugger and pause it. – Has QUIT--Anony-Mousse Jul 07 '12 at 01:42
  • Looks like those threads are actually part of the JVM. (e.g. [this question](http://stackoverflow.com/questions/5766026/default-threads-like-destroyjavavm-reference-handler-signal-dispatcher) and [this one too](http://stackoverflow.com/questions/7698861/simple-java-example-runs-with-14-threads-why)) – huon Jul 07 '12 at 01:58
  • thanks. but are those threads separate processes? i suspect they are not what's using the cpu? it's not clear to me how/when/if jvm threads are green or native. i don't normally see all 4 cores active for a single-threaded java program. – andrew cooke Jul 07 '12 at 02:00
  • (You should just post an answer and accept it, in case anyone else is looking for an answer to this question in the future.) – huon Jul 07 '12 at 02:26

1 Answer


OK, I feel a bit dumb posting this as it now looks pretty obvious, but it seems to be parallel GC. I am processing a lot of data (sucking it in from an SSD) and generating lots of short-lived objects. And it appears that the JVM has parallel GC. See http://blog.ragozin.info/2011/12/garbage-collection-in-hotspot-jvm.html

It may also be a sign of a problem (see the question "What is going on with java GC? PermGen space is filling up?"), which I will investigate tomorrow. I didn't mention it, although in retrospect I should have, but this program is borderline running out of memory.

Update: Running with -XX:+UseSerialGC reduces the total CPU use to 100% (i.e. 1 core). But I didn't really mean that the two explanations above were exclusive, only that with better configuration and/or code I could reduce the amount of GC.
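
One way to check this from inside the JVM, instead of inferring it from htop, is to read the standard `java.lang.management` beans. The sketch below is only an illustration: the class name `GcReport` and the garbage-generating loop are invented for the example. It prints each active collector's name, its collection count, and its accumulated collection time. With the parallel collector the names are typically "PS Scavenge" and "PS MarkSweep". With -XX:+UseSerialGC they are typically "Copy" and "MarkSweepCompact".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original program.
public class GcReport {
    // One line per collector: name, collection count, accumulated elapsed time in ms.
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so the counters have something to show.
        long checksum = 0;
        for (int i = 0; i < 2000000; i++) {
            byte[] junk = new byte[256];
            junk[0] = (byte) i;
            checksum += junk[0];
        }
        System.out.println("checksum " + checksum);
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

Running it once as `java -XX:+UseSerialGC GcReport` and once as `java -XX:+UseParallelGC GcReport` should show different collector names. The reported time is elapsed time, not CPU time, so it will not account directly for the ~380% figure in top.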

andrew cooke
  • Sounds like it is just a lot of GC churn, you probably want to look at ways of reducing allocations in your inner loops (dump data in large pre-allocated arrays, avoid creating interim objects/sequences, avoid primitive boxing etc.) – mikera Jul 07 '12 at 09:45
  • @andrew cooke: You wrote *"and it appears that the JVM has parallel GC"*. Before investigating the second possibility you mention, couldn't you simply first disable the parallel GC (using JVM parameters when you start up the JVM) and see if the behavior continues or not? – TacticalCoder Jul 07 '12 at 10:52
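
The allocation advice in the comments above (avoid primitive boxing and interim objects in inner loops) can be illustrated with a small Java sketch. It is not taken from the program in the question, and the class and method names are hypothetical. Both methods compute the same sum, but the boxed one can allocate a fresh `Long` on most iterations, which is the kind of short-lived garbage that keeps the collector busy. The JIT may remove some of this boxing, so the real cost should be measured, not assumed.

```java
// Hypothetical example, not part of the original program.
public class AllocationSketch {
    // Boxed accumulator: each addition unboxes, adds, and boxes the result again.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total = total + i; // may allocate a new Long object
        }
        return total;
    }

    // Primitive accumulator: same arithmetic, no intermediate objects.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000000) == sumPrimitive(1000000));
    }
}
```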