
I have a task where I need to insert a lot of data into the database using hibernate. I am currently testing with inserting 500,000 entities which each have one relation, so 1,000,000 inserts in total. Based on this guide I have created the following code which actually works. All the data is inserted and committed without errors.

import javax.annotation.Resource;
import javax.ejb.*;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.transaction.*;
import javax.xml.stream.XMLStreamException;
import javax.xml.transform.TransformerException;
import java.io.File;
import java.io.IOException;
import java.text.ParseException;

@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class StackoverflowExample {
    @PersistenceContext
    private EntityManager entityManager;

    @Resource
    private SessionContext sessionContext;

    @EJB
    private XmlProcessorFactory xmlProcessorFactory;

    @EJB
    private TaskManagerBean taskManagerBean;

    public void processFile(String[] args, Task task) throws HeuristicRollbackException, RollbackException, HeuristicMixedException, SystemException, IOException, TransformerException, ParseException, XMLStreamException, NotSupportedException {
        UserTransaction tx = null;

        XmlProcessor instance = xmlProcessorFactory.getInstance(new File("data.xml"));

        XmlElement nextElement = instance.getNextElement();
        int i = 0;
        int batchSize = 50;
        entityManager.setProperty("hibernate.jdbc.batch_size", batchSize);
        tx = sessionContext.getUserTransaction();
        tx.begin();
        while (nextElement != null) {
            Entry entry = new Entry(nextElement.getUserReference(), nextElement.getXml());

            entityManager.persist(entry);

            if (i % batchSize == 0) {
                entityManager.flush();
                entityManager.clear();
            }

            nextElement = instance.getNextElement();
            i++;
        }

        task.setStatus(status); // 'status' and now() come from surrounding code omitted here
        task.setEndTime(now());
        // This gives the OutOfMemoryError
        entityManager.merge(task);

        tx.commit();
    }
}
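One detail worth double-checking in this flush-every-N pattern (a side note, not the cause of the OutOfMemoryError): with a loop counter that starts at 0, the modulo test also fires on the very first iteration, so the first "batch" contains a single entity. A tiny self-contained sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class FlushCadence {
    // Returns the loop indices at which `i % batchSize == 0` holds,
    // i.e. where a flush()/clear() pair would run in a loop whose counter starts at 0.
    static List<Integer> flushPoints(int items, int batchSize) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i < items; i++) {
            if (i % batchSize == 0) {
                points.add(i);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Fires at i = 0, 50, 100: the first flush happens after one entity.
        System.out.println(flushPoints(120, 50));
        // Testing (i + 1) % batchSize == 0 instead would flush after full batches.
    }
}
```

This is cosmetic for memory purposes, but it makes the effective batch boundaries easier to reason about when tuning `hibernate.jdbc.batch_size`.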

This will fail in the line where I call taskManagerBean.update() with the following error:

2017-03-31 08:49:30,212 ERROR [org.jboss.as.ejb3.invocation] (EJB default - 3) WFLYEJB0034:
EJB Invocation failed on component TaskManagerBean for method public void
TaskManagerBean.update(Task,TaskStatus):
javax.ejb.EJBTransactionRolledbackException: org.hibernate.exception.GenericJDBCException:
could not load an entity: [Task#3]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleInCallerTx(CMTTxInterceptor.java:159)
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:256)
    ...
    at TaskManagerBean$$$view18.update(Unknown Source)
    at StoreEntriesBean.processFile(StoreEntriesBean.java:117)
    ...
    at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not load an entity: [Task#3]
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1692)
    ...
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:254)
    ... 104 more
Caused by: org.hibernate.exception.GenericJDBCException: could not load an entity: [Task#3]
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47)
    ...
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.merge(AbstractEntityManagerImpl.java:1161)
    ... 135 more
Caused by: java.sql.SQLException: Error
    at org.jboss.jca.adapters.jdbc.WrappedConnection.checkException(WrappedConnection.java:1972)
    ...
    at org.hibernate.loader.Loader.loadEntity(Loader.java:2204)
    ... 155 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    ...
    at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:478)

Runtime.getRuntime().freeMemory() returns 71540896 right before the error.

I start Wildfly with -Xmx2048m and a quick glance at memory usage shows that it only uses less than half of it.

I have tried committing the inserted Entries after every 1,000 inserts. This triggers the TaskManagerBean, and it fails the same way after a few updates. On a few occasions I have also seen this error after the huge batch job finished and the owning entity was about to be updated.

I have tried with a file with just 5,000 entries and the entire process works fine.

Is this a bug in the MySQL driver or am I doing something wrong here?

homaxto
  • What I don't understand is that it fails equally if I insert 50,000 or 500,000 entries. It actually can insert and commit 500,000, but it fails when updating the task when only inserting 50,000. – homaxto Mar 31 '17 at 07:43

3 Answers


Trying to run batch processes through EJBs pretty much never works because you either run out of memory or the transaction times out.

It's one of the reasons that the "Batch Applications for the Java Platform" (aka JSR-352) specification was developed.

WildFly 10 provides you with an implementation of this.

You can read more about it at Batch Applications in Java EE 7 - Understanding JSR 352 Concepts: TOTD #192
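As a rough sketch of what a JSR-352 chunk step looks like (all ids and artifact names below are made up, not taken from the question): the container commits one transaction per chunk, which keeps both memory use and transaction length bounded.

```xml
<!-- META-INF/batch-jobs/import-entries.xml; job and artifact names are hypothetical -->
<job id="importEntries" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
    <step id="storeEntries">
        <!-- item-count is the checkpoint interval: read 50 items, write them,
             commit, then start a fresh transaction for the next chunk -->
        <chunk item-count="50">
            <reader ref="xmlEntryReader"/>
            <writer ref="entryJpaWriter"/>
        </chunk>
    </step>
</job>
```

The reader and writer are your own `ItemReader`/`ItemWriter` implementations; checkpointing and restart bookkeeping come from the container.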

Steve C

Do you close either the session or the entity manager?

I have never used EntityManager, but when you deal with a database in Java you should always explicitly close the connection when a job (transaction) is done.

exiter2000
  • The container takes care of this in a java-ee scenario. – homaxto Mar 31 '17 at 06:50
  • @homaxto Check this out. There seems to be some disagreement about cleanup. I would suggest trying to close it and see if it makes any difference. http://stackoverflow.com/questions/220374/do-i-have-to-close-every-entitymanager – exiter2000 Mar 31 '17 at 16:14
  • @exiter2000 there is not much disagreement there. The highly up-voted accepted answer is correct. People run into trouble when they try to use their own connection managers - never do that in a full Java EE server. – Steve C Mar 31 '17 at 21:53

The call to EntityManager.merge() is what causes the error. I am not 100% familiar with Hibernate, but apparently merge will in this case fetch the entire relation from Task to Entry, which here was a collection of 500,000 entries, even though the relation is lazily loaded.

I replaced the merge with an entityManager.find(Task.class, taskId) and set the status on that managed instance instead, and that solved my problem.
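In code, the replacement looked roughly like this (a sketch only; `taskId` and the setters are assumed from the question, not verified):

```java
// Look up a managed instance instead of merging the detached Task.
// find() loads just the Task row; the lazy Entry collection stays unloaded
// as long as nothing touches it.
Task managedTask = entityManager.find(Task.class, taskId);
managedTask.setStatus(status);       // same status value as before
managedTask.setEndTime(new Date());
// No merge call needed: the instance is managed, so the UPDATE is issued at commit.
```
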

In the meantime I have introduced Java Batch Processing, which I can only recommend. It saves you from writing the batch plumbing yourself.

homaxto