
I have one table with 3,244,977 Register rows, 154.70 MB in size (figures from phpMyAdmin).

I'm running a standalone Java application and trying to load all of this data through Hibernate. My domain class is:

@Entity
public class Register {

    @Id
    @Column(nullable = false, unique = true, updatable = false)
    private Long userId;

    private Date checked;

    @Column(nullable = false)
    private RegisterType tipo;

    private boolean preLiked = false;
    private boolean preCommented = false;

}

Where RegisterType is an enum, which Hibernate maps to an int.

So as you can see, my domain class is not that complex. Considering that Java adds some overhead to the data size stored in the database, I set my heap space to 4 GB and run my application with:

java -Xmx4G -cp '....classpath.....' com.tomatechines.bot.Starter

So even if the objects were 10 times bigger than the raw data, they should fit in the heap.

But I'm getting java.lang.OutOfMemoryError: Java heap space.

I was afraid some other load was running alongside this big query, so I made a test: I created a standalone jar that just tries to load all the data in that table, with no other variables involved, but I still get the exception.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.resize(HashMap.java:703)
        at java.util.HashMap.putVal(HashMap.java:662)
        at java.util.HashMap.put(HashMap.java:611)
        at org.hibernate.internal.util.collections.IdentityMap.put(IdentityMap.java:94)
        at org.hibernate.engine.internal.StatefulPersistenceContext.addCollection(StatefulPersistenceContext.java:846)
        at org.hibernate.engine.internal.StatefulPersistenceContext.addUninitializedCollection(StatefulPersistenceContext.java:817)
        at org.hibernate.type.CollectionType.getCollection(CollectionType.java:739)
        at org.hibernate.type.CollectionType.resolveKey(CollectionType.java:436)
        at org.hibernate.type.CollectionType.resolve(CollectionType.java:429)
        at org.hibernate.engine.internal.TwoPhaseLoad.doInitializeEntity(TwoPhaseLoad.java:151)
        at org.hibernate.engine.internal.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:125)
        at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:1132)
        at org.hibernate.loader.Loader.processResultSet(Loader.java:992)
        at org.hibernate.loader.Loader.doQuery(Loader.java:930)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:336)
        at org.hibernate.loader.Loader.doList(Loader.java:2610)
        at org.hibernate.loader.Loader.doList(Loader.java:2593)
        at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2422)
        at org.hibernate.loader.Loader.list(Loader.java:2417)
        at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:109)
        at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1787)
        at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:363)
        at com.tomatechines.utils.hibernate.GenericDAO.find(GenericDAO.java:183)

This same query runs in phpMyAdmin in less than one second.

Is the data supposed to get that much bigger when loaded from Java? Does Hibernate make things grow to 30 times their size in the database? How can I handle this without making the heap space bigger?

trincot
Rafael Lima

3 Answers


Well, for roughly 3,300,000 Register objects you are going to use around 160 MB just for the object headers (Register itself, plus its Long, Date and RegisterType fields), not even counting their contents; that is already more than the database reports, so yes, you will use quite a lot of heap space.
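That estimate can be sketched as simple arithmetic. The 16 bytes of header-plus-padding per object is an assumption (a 64-bit JVM with compressed oops), and the row count is rounded as above:

```java
// Back-of-envelope heap estimate for ~3.3 million Register rows.
// Assumes ~16 bytes of header and alignment padding per object on a
// 64-bit JVM with compressed oops; real numbers vary by JVM and flags.
public class HeapEstimate {
    static final long ROWS = 3_300_000L;

    // Three heap objects per row carry their own headers:
    // the Register itself, its boxed Long id, and its Date.
    static long headerBytesPerRow() {
        return 3 * 16;
    }

    static long totalHeaderBytes() {
        return ROWS * headerBytesPerRow();
    }

    public static void main(String[] args) {
        // ~158 MB before any field data, map nodes, or Hibernate bookkeeping
        System.out.println(totalHeaderBytes() / 1_000_000 + " MB of headers alone");
    }
}
```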

It also looks like Hibernate puts those entries in a Map, which means it wraps instances in map nodes; each node has a key and a value, which is at least another 160 MB just for the headers, and so on...

You could measure exactly how much each object takes, via jol for example, but that will not buy you anything; you are still going to fail with an OutOfMemoryError. I would first ask why you need close to 3.5 million entries in memory at all; if there is a compelling reason for that, I would try something lower-level than Hibernate.
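jol here is the OpenJDK Java Object Layout tool (the org.openjdk.jol:jol-core artifact). A minimal sketch of using it to inspect Register, assuming jol-core is on the classpath:

```java
import org.openjdk.jol.info.ClassLayout;
import org.openjdk.jol.info.GraphLayout;

// Sketch: print the field layout of Register and the deep size of one
// instance, using jol-core (org.openjdk.jol:jol-core).
public class MeasureRegister {
    public static void main(String[] args) {
        // Per-class layout: header size, field offsets, alignment padding
        System.out.println(ClassLayout.parseClass(Register.class).toPrintable());

        // Deep size of one instance, including the referenced Long and Date
        Register r = new Register();
        System.out.println(GraphLayout.parseInstance(r).totalSize() + " bytes");
    }
}
```

Multiplying that deep size by the row count gives a more honest estimate than header counting alone.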

Eugene
  • can you tell me exactly what jol is? I searched Google but it returns some French expression. And actually I really need the whole database in memory for a batch validation process, and as I can't spare more memory for the heap I will go with a pagination process – Rafael Lima Apr 12 '17 at 18:07
  • 1
    @RafaelLima I would also give a shot using this: https://docs.jboss.org/hibernate/orm/3.5/api/org/hibernate/StatelessSession.html – Eugene Apr 12 '17 at 18:29
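A StatelessSession bypasses the first-level cache, so the identity map that blew up in the stack trace above never grows. A minimal sketch, assuming the question's Register entity with a getChecked() getter and an already-configured SessionFactory:

```java
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;

// Sketch: stream Register rows with a StatelessSession so Hibernate keeps
// no persistence context; each row is garbage-collectable after use.
public class StatelessScan {
    public static long countChecked(SessionFactory factory) {
        long checked = 0;
        StatelessSession session = factory.openStatelessSession();
        try {
            ScrollableResults rows = session
                    .createQuery("from Register")         // HQL over the mapped entity
                    .scroll(ScrollMode.FORWARD_ONLY);     // a cursor, not a full List
            while (rows.next()) {
                Register r = (Register) rows.get(0);
                if (r.getChecked() != null) checked++;    // assumed getter
                // r is effectively detached; nothing accumulates in the session
            }
            rows.close();
        } finally {
            session.close();
        }
        return checked;
    }
}
```

This keeps memory flat regardless of table size, at the cost of losing dirty checking and lazy loading.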

You can increase heap size like this: https://stackoverflow.com/a/6452812/3978990

Or paginate the data from the database and process it in parts.
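The paging approach can be sketched independently of Hibernate: fetch a fixed-size page, process it, and stop when a short page comes back. The fetchPage function here is a hypothetical stand-in for a real query such as one using setFirstResult and setMaxResults:

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Generic page-at-a-time processing loop. fetchPage(offset, limit) stands in
// for a database query returning at most `limit` rows starting at `offset`.
public class PagedProcessor {
    public static <T> long processAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                                      int pageSize,
                                      Consumer<T> processor) {
        long total = 0;
        int offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, pageSize);
            for (T item : page) {
                processor.accept(item);
                total++;
            }
            if (page.size() < pageSize) break;  // last (short or empty) page
            offset += pageSize;
        }
        return total;
    }
}
```

Only one page is ever referenced at a time, so the heap requirement is bounded by the page size, not the table size.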

Community

I'm running a standalone Java application and trying to load all this data through Hibernate... But I'm getting java.lang.OutOfMemoryError: Java heap space

Whether or not this is expected depends on the details of your Register class. You are loading 3 million of your Register objects using Hibernate. For each of those you are loading userId, which is a Long (it could be a long primitive since it's nullable = false), and the RegisterType field, so that's another 6 million objects. I'm not sure whether checked is in the database as well; you might need to mark it @Transient or similar to keep it from being loaded. If it is loaded, that's another object per Register. I'm also not sure whether the RegisterType field is loaded eagerly, which would mean even more objects.

So depending on the fields, you may be talking about 6 to 15+ million objects in memory, and there is a lot of overhead in Hibernate: it tries to add the objects to the internal identity maps that are part of its caching. You'll certainly need more than 128 MB of memory to do this.

The big question is whether you actually need all of this data in memory to do your processing. Can you page over the database instead, so you only load, say, 1000 objects at a time?
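A sketch of that paging with the Criteria API the question's GenericDAO already uses, clearing the session between pages so the identity map never holds more than one page. The SessionFactory and the per-row validation step are assumed:

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

// Sketch: load Register rows 1000 at a time, clearing the persistence
// context after each page so memory use stays bounded by the page size.
public class PagedLoad {
    public static void process(SessionFactory factory) {
        final int pageSize = 1000;
        Session session = factory.openSession();
        try {
            for (int offset = 0; ; offset += pageSize) {
                List<?> page = session.createCriteria(Register.class)
                        .setFirstResult(offset)
                        .setMaxResults(pageSize)
                        .list();
                for (Object o : page) {
                    Register r = (Register) o;
                    // ... validate r here ...
                }
                session.clear();                     // drop the page from the identity map
                if (page.size() < pageSize) break;   // last page reached
            }
        } finally {
            session.close();
        }
    }
}
```

Ordering the query by the primary key (or paging by id ranges) avoids rows shifting between pages if the table is written to concurrently.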

Gray