I want to start determining the memory requirements of my refs and how they grow with application usage. How can I do this?
3 Answers
Someone asked about this on the mailing list a while ago (and probably someone else before that, and...). A few people provided utilities that kinda-sorta do what you might want, but I still prefer my answer: you can't do this in a language with such pervasive and automatic structural sharing. How do you calculate the size of a large object that you have two pointers to, and so on?

I didn't understand the answer. What do you mean by "two pointers"? – yazz.com Apr 05 '11 at 09:26
Also, what do you mean by "pervasive and automatic structural sharing"? – yazz.com Apr 05 '11 at 09:29
I mean the same thing that @Joost means: you can create one large object, and have two symbols that both refer to it, or two entries in a vector that both point to the same large object. Even if it were trivial to implement a total-memory counter that was aware of that, it would be difficult to interpret its results meaningfully, regardless of how it was designed. – amalloy Apr 06 '11 at 00:17
In general, this isn't a very useful thing to measure: the data used by any single ref is likely to be shared with many other refs, so a per-ref figure doesn't tell you much.
Also, it will be highly JVM-specific: different JVM implementations may use different amounts of memory for the same Clojure structures depending on how they choose to pack data structures and pointers. For example, I believe that HotSpot pads object sizes up to the nearest 8 bytes, but other JVMs could do something completely different. Also, 32-bit and 64-bit JVMs will typically use different pointer sizes (though not necessarily in the obvious way, as some 64-bit JVMs use compressed pointers).
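To make the padding point concrete, here is a minimal sketch of the rounding arithmetic. `padded-size` is a hypothetical helper, and the default alignment of 8 bytes is only the assumption stated above about HotSpot; other JVMs or settings may differ.

```clojure
;; Hypothetical helper: rounds a raw byte count up to the next multiple
;; of the alignment. The default of 8 is an assumption, not a guarantee.
(defn padded-size
  ([raw-bytes] (padded-size raw-bytes 8))
  ([raw-bytes alignment]
   (* alignment (quot (+ raw-bytes (dec alignment)) alignment))))

(padded-size 12) ;; => 16
(padded-size 16) ;; => 16
(padded-size 17) ;; => 24
```

So under that assumption, an object needing 17 bytes would occupy as much as one needing 24.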
If you are still determined to do this, the best approach would probably be to recursively descend the data structure in the ref and add up the estimated size of each sub-element.
- You'd need to make assumptions or experimentally verify the size/overhead of each possible component type. Not easy... see this question for some of the gory details of estimating object sizes on the JVM. If you're lucky, you might be able to find a library that does this for you.
- You would also need to keep track of all objects visited, which is also a bit tricky since you'd need to compare on object identity rather than equality, and hence you wouldn't be able to use the usual equality-based hashmap/set types. A hashmap of (object hashcode -> collection of objects with the same hashcode) would work, as would an identity-based map such as java.util.IdentityHashMap.
- There will also be some fun Clojure-specific corner cases to consider... e.g. are you counting metadata on a data structure or not?
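The steps above could be sketched roughly as follows. This is only an illustration under loud assumptions: `children` and `estimate-size` are hypothetical names, the per-object size function `size-of` is something you would have to supply yourself, metadata is ignored, and only Clojure collections are descended into (an infinite lazy seq would never terminate). It uses `java.util.IdentityHashMap` for the visited set instead of the hashcode-bucket map described above.

```clojure
(import '(java.util IdentityHashMap))

(defn children
  "Directly contained elements of x, or nil for anything that is not a
  Clojure collection. Metadata is deliberately ignored in this sketch."
  [x]
  (cond
    (map? x)  (concat (keys x) (vals x))
    (coll? x) (seq x)
    :else     nil))

(defn estimate-size
  "Walks everything reachable from root, visiting each distinct object
  (by identity) exactly once, and sums (size-of obj) over those objects.
  size-of is caller-supplied; this function knows nothing about bytes."
  [root size-of]
  (let [^IdentityHashMap seen (IdentityHashMap.)]
    ;; explicit stack instead of recursion, to avoid deep call chains
    (loop [stack (list root)
           total 0]
      (if (empty? stack)
        total
        (let [x    (first stack)
              more (rest stack)]
          (if (or (nil? x) (.containsKey seen x))
            (recur more total)
            (do
              (.put seen x true)
              (recur (into more (children x))
                     (+ total (size-of x))))))))))

;; Usage: count distinct objects by passing a size-of that always returns 1.
(def shared [:x :y :z])
(estimate-size {:a shared :b shared} (constantly 1)) ;; => 7
```

The shared vector is counted once even though it is reachable through two keys, which is exactly why the result is hard to attribute to any one ref.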
Overall, though, I'd still recommend paying attention to the memory consumed by your application as a whole, rather than by specific refs.
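For the whole-application view, a minimal sketch using the standard `java.lang.Runtime` methods might look like this. `used-memory-bytes` is a hypothetical helper name, and the figure is only a rough snapshot of heap usage, since it includes garbage that has not been collected yet.

```clojure
(defn used-memory-bytes
  "Approximate heap currently in use by this JVM, in bytes:
  total heap allocated so far minus the free part of it."
  []
  (let [rt (Runtime/getRuntime)]
    (- (.totalMemory rt) (.freeMemory rt))))

;; Sample it before and after exercising the application to see how it grows.
(println "used heap (bytes):" (used-memory-bytes))
```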
Because I can't do code in comments:
;; assumes large-hash-map is already defined
(let [a1 large-hash-map
      a2 (assoc large-hash-map :foo :bar)]
  ;; now, the two 'pointers' are a1 and a2,
  ;; and the data structures they point to can share
  ;; most (but not all) of their data,
  ;; making it more or less meaningless to ask
  ;; how much memory either of the bindings holds
  )
Whether we're talking about refs or plain bindings doesn't matter as far as your question is concerned.
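If you want to see the sharing for yourself, something like the following should show it. The names `payload`, `a1` and `a2` are made up for illustration.

```clojure
;; One large value, referenced from two different maps.
(def payload (vec (range 100000)))
(def a1 {:data payload})
(def a2 (assoc a1 :foo :bar))

(identical? a1 a2)                 ;; => false, two distinct map objects
(identical? (:data a1) (:data a2)) ;; => true, the same vector object in both
```

Both maps "hold" the 100000-element vector, yet it exists only once on the heap.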

Isn't it possible to only traverse the objects if they have different identity though? Is that even possible? Maybe keep a list of already visited objects? – yazz.com Apr 05 '11 at 09:40
Yeah, it's possible. But not directly from pure Clojure, AFAIK. Equality in Clojure is, for various reasons, mostly based on values rather than object identity. For instance, two hash-maps with the same key-value pairs will compare as equal by design, even if they're two separate objects with differing implementations. – Joost Diepenmaat Apr 05 '11 at 18:47
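A small sketch of the difference, for what it's worth: `=` compares by value, while the core function `identical?` compares object references. The names `m1` and `m2` are just for illustration.

```clojure
;; Same key-value pairs, two separate objects with different implementations.
(def m1 (array-map :a 1 :b 2))
(def m2 (hash-map :a 1 :b 2))

(= m1 m2)                  ;; => true, equal by value
(identical? m1 m2)         ;; => false, not the same object
(= (class m1) (class m2))  ;; => false, different concrete classes
```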