2

I'm trying to memory-optimize my server that keeps running into OOM.

Most of the objects (by count) in the server take the following form:

  • Each object is a HashMap
  • HashMap keys are strings
  • HashMap values are objects of Attribute class, which just has an int and 2 booleans.

Important caveat: 95% of such hashmaps would ONLY EVER have one key; and I know whether that's the case when creating the hashmap.

There are millions of these hashmaps.

I already asked a separate question about optimizing those hashmaps memory wise, and someone in a comment suggested that perhaps re-engineering my whole data structure would be better since even with initial size of "1" HashMaps still take up extra memory.

As such, my question is; is there a better Java data structure I can implement which can store the same exact data with better memory efficiency?

NOTE: I need to be able to look up whether a specific value is present as a key; therefore I have considered but rejected storing the data in an list of quintuples of [string_value, int, boolean, boolean].

DVK
  • 126,886
  • 32
  • 213
  • 327
  • Can the map keys change after the maps are created? – John Kugelman Oct 18 '17 at 18:04
  • @JohnKugelman - no. The value "attribute" objects can sometimes change but not the keys – DVK Oct 18 '17 at 18:05
  • As an aside, someone on initial questrion mentioned singletonMap. If that's a good answer, I would appreciate someone with a clue offering actual "this is how much memory a regular HashMap takes, this is how much a singletonMap takes" – DVK Oct 18 '17 at 18:06
  • Is this a homemade class/object system? If so, you could save a lot of memory by having classes have string-to-index maps and objects just have arrays of values. – John Kugelman Oct 18 '17 at 18:06
  • @JohnKugelman - sorry, i'm having an attack of stupids. Why would string to index maps save more memory than string to object? – DVK Oct 18 '17 at 18:08
  • What about using a tool such as http://www.mapdb.org? – Mick Mnemonic Oct 18 '17 at 18:08
  • I don't know your usage patterns -- if there are many objects with identical keys, a class system would put those identical keys in one place, with class instances only storing the values. That's much better than all the instances repeating the same key information. – John Kugelman Oct 18 '17 at 18:09
  • @JohnKugelman - not sure what you're driving at, but you did give me a slightly off the wall idea of caching my attribute objects so they do not have duplicates. Thanks! that ought to save memory – DVK Oct 18 '17 at 18:11
  • You could replace all of the single-key HashMaps with one HashMap. If the keys are unique across HashMaps, that's trivial. Otherwise, change the key to include the higher-level context. So if you had a HashMap named "foo" and another named "bar", and both can have an "attributes" key, just change the keys to "bar_attributes" and "foo_attributes". Come to think of it, you could replace *all* of your HashMaps with this single one. – Jim Mischel Oct 18 '17 at 18:22
  • "Important caveat: 95% of such hashmaps would ONLY EVER have one key; and I know whether that's the case when creating the hashmap." - any particular reason these need to be HashMaps, and not some specialized one-entry map class? You could probably save a lot of space with a custom map that just has a `key` and a `value` field. (On Java 9, there's even `Map.of`.) – user2357112 Oct 18 '17 at 18:33
  • @user2357112 - because I do need to accomodate the other 5% cases and not sure how to do so in a way that doesn't waste memory as well – DVK Oct 18 '17 at 19:06
  • 1
    @DVK: You don't need to use the same map class for every map. – user2357112 Oct 18 '17 at 19:48

2 Answers2

2

Expose the less specific Map interface to the user instead of HashMap, this given you the freedom to use hash maps or singleton maps depending on the case.

Memory efficient singleton maps can be created via Collections.singletonMap(): https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#singletonMap(K,%20V)

SingletonMap seems to be implemented as an object with just two fields and some cache: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/Collections.java#Collections.SingletonMap

p.s. You could have something even more compact for your special case by collapsing the map and value into a single instance like this:

public final class SingletonAttributeMap extends Attribute implements Map<String,Attribute> {

  private final String key;

  public Attribute get(String key) {
    return this.key.equals(key) ? this : null;
  }

  ....
}

p.p.s Is there a maximum size for the integers in the attributes? You might be able to do tricks like this (whether this actually saves memory will depend on padding / alignment, see What is the memory consumption of an object in Java?):

 class Attribute {
    private int value;

    boolean getBoolean1() {
      return (value & 1) != 0; 
    }

    boolean getBoolean2() {
       return (value & 2) != 0;
    }

    int getInt() {
       return value >> 2;
    }

    void setBoolean1(boolean b) {
      value = (value & ~1) | (b ? 1 : 0);
    }

    void setInt(int i) {
      value = (value & ~3) | (i << 2);
    }

    ... 
Stefan Haustein
  • 18,427
  • 3
  • 36
  • 51
1

Each object is a HashMap? Not a very good abstraction. I'd prefer composition: create an object that HAS-A HashMap and provides a clear API. Make it Comparable and implement hashCode and equals.

I would suggest that you follow the Flyweight pattern if lots of those are repeated.

If the objects are immutable and read-only this will be a perfectly good optimization.

Have a Factory object to create instances and maintain a List of unique ones. If someone asks for one that's already in the List return the immutable copy.

You can calculate how much of a memory savings this will provide before you implement it.

duffymo
  • 305,152
  • 44
  • 369
  • 561
  • The question was what's the structure that saves the most memory. "you can calculate" seems to indicate it may have even worse memory performance. – DVK Oct 18 '17 at 19:05
  • When I say "you can calculate", I mean that you can look at the data you have today and answer "This is how many Flyweight immutable instances I can hand out without having to duplicate the objects in memory." The key is immutable data and lots of repeats. If both of those assumptions are true you'll save a lot of memory if you follow my suggestion. – duffymo Oct 18 '17 at 19:58
  • OK, as is often the case with me, I'm smart enough to use the pattern bit not smart enough to know that i'm using a named pattern :) I did the Flyweight thing for my attributes around the time you posted the answer, just didn't realize that was what it was. It saved SOME memory but not enough apparently. I'm not sure what you mean by HAS-A hashmap? you mean I store just 1 value as one member and hashmap (null in case of only 1 value) as another member? – DVK Oct 18 '17 at 20:05
  • I mean encapsulate that implementation detail inside an object instead of using a raw HashMap. Store as much as you like, but do it through public methods that guarantee the proper operation. – duffymo Oct 18 '17 at 21:05