
In my application I'm using CSVReader and Hibernate to import a large number of entities (1,500,000 or more) into the database from a CSV file. The code looks like this:

        Session session = headerdao.getSessionFactory().openSession();
        Transaction tx = session.beginTransaction();

        int count = 0;
        String[] nextLine;

        while ((nextLine = reader.readNext()) != null) {
            try {

                if (nextLine.length == 23
                        && Integer.parseInt(nextLine[0]) > lastIdInDB) {
                    JournalHeader current = parseJournalHeader(nextLine);
                    current.setChain(chain);
                    session.save(current);
                    count++;
                    if (count % 100 == 0) {
                        session.flush();
                        tx.commit();
                        session.clear();
                        tx.begin();
                    }
                    if (count % 10000 == 0) {
                        LOG.info(count);
                    }

                }

            } catch (NumberFormatException e) {
                e.printStackTrace();
            } catch (ParseException e) {
                e.printStackTrace();
            }

        }
        tx.commit();
        session.close();

With large enough files (somewhere around 700,000 lines) I get an out of memory exception (heap space).

It seems that the problem is somehow Hibernate related, because if I comment out just the line session.save(current); it runs fine. If it's left in, the task manager shows the memory usage of javaw growing continuously, and at some point the parsing gets really slow and the import crashes.

parseJournalHeader() does nothing special; it just builds an entity from the String[] that the CSV reader gives it.

  • It looks like you're doing the right thing with the session clearing, which should be sorting out the memory problems ... one potential culprit might be the second-level cache, which won't be emptied by session.clear(). What are its settings? Maybe try setting CacheMode.GET (see the sketch after these comments)? – sMoZely May 26 '11 at 10:13
  • A stateless session might be useful here. Read the limitations of stateless sessions here: http://docs.jboss.org/hibernate/core/3.6/javadocs/org/hibernate/StatelessSession.html, and if they're not a problem in your use case, try using it (through sessionFactory.openStatelessSession()). – JB Nizet May 26 '11 at 10:25
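
A minimal sketch of the CacheMode.GET suggestion above, assuming a second-level cache is actually configured for JournalHeader (if there is none, this setting changes nothing):

    // Sketch of the CacheMode.GET suggestion: the session may still read from
    // the second-level cache, but will not put the imported entities into it.
    Session session = headerdao.getSessionFactory().openSession();
    session.setCacheMode(CacheMode.GET); // org.hibernate.CacheMode
    Transaction tx = session.beginTransaction();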

1 Answer


The Session keeps every persisted object in its first-level cache. You are doing the right things to deal with the first-level cache, but there can be other things that prevent those objects from being garbage collected.

Try to use StatelessSession instead.
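
For illustration, a minimal sketch of the import loop from the question rewritten with a StatelessSession (headerdao, chain, lastIdInDB and parseJournalHeader are the names used in the question). A stateless session has no first-level cache and no dirty checking, so there is nothing to flush or clear:

    StatelessSession session = headerdao.getSessionFactory().openStatelessSession();
    Transaction tx = session.beginTransaction();

    String[] nextLine;
    while ((nextLine = reader.readNext()) != null) {
        try {
            if (nextLine.length == 23
                    && Integer.parseInt(nextLine[0]) > lastIdInDB) {
                JournalHeader current = parseJournalHeader(nextLine);
                current.setChain(chain);
                // insert() issues the INSERT immediately; nothing is kept in a cache
                session.insert(current);
            }
        } catch (NumberFormatException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }
    tx.commit();
    session.close();

Note that, as the linked javadoc points out, a stateless session bypasses cascades, interceptors and Hibernate's caches.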

Oleg
  • I tried using StatelessSession but with the same results. Examining the heap dump with Eclipse Memory Analyzer tells me that 6 com.mysql.jdbc.JDBC4PreparedStatement instances are the top leak suspects; they consume pretty much all of the memory. Does this tell you something about the problem? – northernd May 26 '11 at 11:24
  • It seems the problem is somehow related to c3p0 caching the statements. I removed the c3p0 configuration from hibernate.cfg and got rid of the memory issues. I'll try to investigate it further. – northernd May 26 '11 at 11:45
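
For anyone who runs into the same thing: c3p0 keeps a per-connection cache of PreparedStatements, controlled by hibernate.c3p0.max_statements. As an alternative to removing the c3p0 configuration entirely, here is an illustrative sketch of turning just that cache off programmatically (the rest of the configuration is assumed to live in hibernate.cfg.xml):

    // Illustrative sketch: disable c3p0's PreparedStatement cache.
    // max_statements = 0 turns statement caching off, which is where the
    // JDBC4PreparedStatement instances flagged as leak suspects were held.
    Configuration cfg = new Configuration().configure(); // reads hibernate.cfg.xml
    cfg.setProperty("hibernate.c3p0.max_statements", "0");
    SessionFactory sessionFactory = cfg.buildSessionFactory();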