
I'd like to find out how many bytes are being used in the cache. This is useful in determining reasonable sizing. What are some good ways to tally the number of bytes used in a Google Guava cache?

The stats method doesn't give what I want; it does not include metrics on the number of bytes in the cache.

The asMap method is the best way I've found so far. With the map in hand, one could use some of the techniques shown in "In Java, what is the best way to determine the size of an object?". But, frankly, these seem fairly painful, at least from a Clojure codebase. To avoid extra dependencies, I'm currently using a rough shortcut with Nippy, a Clojure serialization library: (count (nippy/freeze (.asMap cache))). I'm looking for better ways.
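For readers coming from Java, the same serialization shortcut might look roughly like the sketch below. It uses only the JDK: a ConcurrentHashMap stands in for the view returned by cache.asMap(), and the class and method names (SerializedSizeEstimate, serializedBytes) are made up for illustration. It also assumes keys and values are Serializable. The result is the length of the serialized form, not the heap footprint, so treat it as a rough proxy at best.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SerializedSizeEstimate {
    // Hypothetical helper: copies the entries into a plain HashMap,
    // serializes that snapshot, and returns the number of bytes written.
    static long serializedBytes(Map<? extends Serializable, ? extends Serializable> entries) {
        Map<Serializable, Serializable> snapshot = new HashMap<>(entries);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(snapshot);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // Stand-in for cache.asMap(); a real Guava cache is not needed for the sketch.
        Map<String, byte[]> view = new ConcurrentHashMap<>();
        view.put("a", new byte[1024]);
        view.put("b", new byte[4096]);
        System.out.println("serialized bytes: " + serializedBytes(view));
    }
}
```

Objects shared between entries are written only once per stream, and serialization overhead differs from object-header overhead, so the number can be off in either direction.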

I am using Google Guava caching from a Clojure codebase, but my question is not necessarily Clojure specific; Java interop is relatively easy in most cases.

Update: Some context in response to a comment below. First, I'd like to know if I'm overlooking a useful part of the Google Guava caching API. Second, I don't know if the generic approaches I linked (for counting memory usage on the heap) apply well to Guava. More broadly, finding cache size utilization is an important use case, so I'm a little surprised it isn't better documented online.

David J.
  • 2
    What about this question makes it not an exact duplicate of the linked question? You want to find the size of an object, and it makes you sad that this isn't easy, and you're considering serializing it and counting the bytes: those are exactly the main points of the other question. – amalloy Sep 05 '14 at 18:53
  • 1
    @amalloy Updated above. This question isn't an *exact* duplicate -- though there may be commonalities -- since I asked in the context of Guava. It isn't obvious (to me, at least) that the approaches shown in the linked question are the best for this use case. – David J. Sep 05 '14 at 19:31
  • 6
    The answer you linked to is "fairly painful" because there _is_ no non-painful way to measure memory usage in bytes of Java objects, including Guava Caches. It's not better documented online because you're usually better off trying to find some other thing to measure. – Louis Wasserman Sep 05 '14 at 20:30

1 Answer


Java (and by extension Guava) does not provide any easy or meaningful way to measure "bytes used" by an object or data structure. Notably there isn't even a single coherent definition of that concept, since an object can be referenced from multiple other objects and there's no notion of bytes being "owned" by a particular data structure. Other languages like Rust have this notion of ownership, but Java does not.

For example, how many bytes does an instance of MyClass use?

public class MyClass {
  private static final int[] BIG_ARRAY = new int[1_000_000];
  private final int[] myArray = BIG_ARRAY;
}

Clearly the class uses a lot of memory, but each instance only uses a few bytes to reference the statically allocated array. You can create thousands of MyClass instances and see very little memory impact, and even if all instances are GCed, BIG_ARRAY will stick around. So it seems wrong to say that an instance of MyClass "uses" the backing array's bytes.
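To make the sharing concrete, here is a small sketch that creates many instances and checks that they all point at the same array object. The wrapper class SharedArrayDemo, the backingArray() accessor and the allShareOneArray method are hypothetical additions for the demonstration; MyClass is otherwise the class from above.

```java
import java.util.ArrayList;
import java.util.List;

class SharedArrayDemo {
    static final class MyClass {
        private static final int[] BIG_ARRAY = new int[1_000_000];
        private final int[] myArray = BIG_ARRAY;

        // Hypothetical accessor, added only so the demo can inspect the reference.
        int[] backingArray() {
            return myArray;
        }
    }

    // Returns true if every instance references the very same array object.
    static boolean allShareOneArray(int instanceCount) {
        List<MyClass> instances = new ArrayList<>();
        for (int i = 0; i < instanceCount; i++) {
            instances.add(new MyClass());
        }
        if (instances.isEmpty()) {
            return true;
        }
        int[] first = instances.get(0).backingArray();
        for (MyClass instance : instances) {
            if (instance.backingArray() != first) { // identity, not equality
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allShareOneArray(10_000));
    }
}
```

Any tool that sums "bytes reachable from each instance" would count the million-element array once per instance, which is exactly the ambiguity described above.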

You can determine how many bytes the cache itself uses (e.g. to compare it to using a ConcurrentHashMap or another collection) based on the fields and instances a Cache maintains. Guava links to this helpful resource of data structure memory footprints you can reference, but obviously this won't include the contents of the cache, just its structure.

Needless to say, as Louis Wasserman commented, you should look for a different metric that more directly tells you what you need to know. For instance you might be more interested in the hit rate, which tells you how efficiently you're using whatever data your cache is retaining.
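With Guava itself, building the cache with CacheBuilder.recordStats() and then reading cache.stats().hitRate() reports this number. The sketch below only illustrates the arithmetic behind it, using a tiny JDK-only LRU map so that it runs without Guava. The class HitRateSketch and its methods are hypothetical, it is not thread-safe, and it does not support null values. Returning 1.0 when there have been no requests follows the convention documented for CacheStats.hitRate().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class HitRateSketch<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    HitRateSketch(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key);
        entries.put(key, value);
        return value;
    }

    // hits / (hits + misses), or 1.0 if nothing has been requested yet.
    double hitRate() {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        HitRateSketch<String, Integer> cache = new HitRateSketch<>(2, String::length);
        cache.get("a");
        cache.get("a");
        cache.get("bb");
        cache.get("a");
        System.out.println("hit rate: " + cache.hitRate());
    }
}
```

A hit rate that stays low as the maximum size grows suggests the cache is holding data that is rarely reused, which is often a more actionable signal for sizing than a byte count.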

dimo414