I need some help understanding some GHC heap profiles. I'm working on a new output format for an existing compiler for the Grammatical Framework language.
Before
Profiling a normal run of the current version of the compiler looks like this:
13,347,660,488 bytes allocated in the heap
213,062,924,208 bytes copied during GC
740,585,528 bytes maximum residency (567 sample(s))
4,844,112 bytes maximum slop
1438 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 12312 colls, 0 par 1.321s 1.382s 0.0001s 0.0025s
Gen 1 567 colls, 0 par 206.442s 208.534s 0.3678s 0.8388s
INIT time 0.001s ( 0.005s elapsed)
MUT time 68.757s ( 68.986s elapsed)
GC time 109.557s (111.086s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 98.206s ( 98.830s elapsed)
EXIT time 0.000s ( 0.001s elapsed)
Total time 276.521s (278.908s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 194,128,534 bytes per MUT second
Productivity 60.4% of total user, 60.2% of total elapsed
After
With the changes I've made, it now looks like this:
15,996,529,552 bytes allocated in the heap
216,727,207,600 bytes copied during GC
889,791,520 bytes maximum residency (578 sample(s))
5,417,952 bytes maximum slop
1741 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 14915 colls, 0 par 1.641s 1.721s 0.0001s 0.0014s
Gen 1 578 colls, 0 par 226.744s 230.323s 0.3985s 1.1982s
INIT time 0.001s ( 0.005s elapsed)
MUT time 70.046s ( 70.242s elapsed)
GC time 120.183s (122.695s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 108.201s (109.349s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 298.431s (302.291s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 228,370,682 bytes per MUT second
Productivity 59.7% of total user, 59.4% of total elapsed
So this new version uses more time and memory than the previous one, and my goal is to reduce those as much as possible.
- The first part of both graphs looks the same, this stage of the compiler (namely reading binary files) has not been touched and thus I would expect it to look identical.
- After that, the graphs are hugely different. I feel like the huge cliff edge in graph #2 is telling me something, but I don't quite know what.
- The blue area dominating the second half of graph #2 (24956) is my newly added code. As I haven't aggressively optimised that code, I would expect there to be scope for improvement there. But do these different graphs give any hints about laziness/strictness or where to investigate further?