27

Here is simple example I've created after reading several topics about jpa bulk inserts, I have 2 persistent objects User, and Site. One user could have many site, so we have one to many relations here. Suppose I want to create user and create/link several sites to user account. Here is how code looks like, considering my willing to use bulk insert for Site objects.

User user = new User("John Doe");

user.getSites().add(new Site("google.com", user));
user.getSites().add(new Site("yahoo.com", user));

EntityTransaction tx = entityManager.getTransaction();
tx.begin();
entityManager.persist(user);
tx.commit();

But when I run this code (I'm using hibernate as jpa implementation provider) I see following sql output:

Hibernate: insert into User (id, name) values (null, ?)
Hibernate: call identity()
Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
Hibernate: call identity()
Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
Hibernate: call identity()

So, I means "real" bulk insert not works or I am confused?

Here is source code for this example project, this is maven project so you have only download and run mvn install to check output.

UPDATED:

After Ken Liu kindly advise, I've disabled Site object id auto generation:

    User user = new User("John Doe");
    user.getSites().add(new Site(1, "google.com", user));
    user.getSites().add(new Site(2, "yahoo.com", user));
    entityManager.setFlushMode(FlushModeType.COMMIT);
    EntityTransaction tx = entityManager.getTransaction();
    tx.begin();
    entityManager.persist(user);
    tx.commit();

Now I have following line in debug output:

DEBUG: org.hibernate.jdbc.AbstractBatcher - Executing batch size: 2

It works!

Joel
  • 29,538
  • 35
  • 110
  • 138
Andriy Kopachevskyy
  • 7,276
  • 10
  • 47
  • 56

3 Answers3

22

If you're using the database to generate ids, then Hibernate has to execute a query to generate the primary key for each entity.

Ken Liu
  • 22,503
  • 19
  • 75
  • 98
  • 5
    how are you generating your keys now? You will have to be sure that your keys are unique. – Ken Liu May 05 '10 at 17:13
  • 2
    Wouldn't it be possible to make Hibernate perform `UPDATE sometbl SET counter=counter+1000` before inserting 1000 objects, and then just use the 1000 id's it just reserved? – aioobe Dec 01 '13 at 01:14
  • 1
    @KenLiu Does this means you can't bulk insert multiple entries using oracle when the PK is generated via a sequence`? – bish Mar 11 '17 at 12:17
8

I have found it much more efficient to bypass hibernate for bulk inserts. You must do away with ORM (object relational mapping) but you can still leverage the connection associated with the current session and the transaction management.

While you temporarily lose the convenience of your ORM, the payoff is significant, especially if you have natively generated Ids since hibernate would normally perform one SELECT for each INSERT.

Session.doWork is very handy for facilitating this.

private MyParentObject saveMyParentObject(final MyParentObject parent, final List<MyChildObject> children)
{
    transaction = session.beginTransaction();
    try
    {
        session.save(parent); // NOTE: parent.parentId assigned and returned here

        session.doWork(new Work()
        {
            public void execute(Connection con) throws SQLException
            {
                // hand written insert SQL - can't use hibernate
                PreparedStatement st = con.prepareStatement("INSERT INTO my_child (parent_id, name, ...) values (?, ?, ...)");

                for (MyChildObject child : children)
                {
                    MyChildObject child = new MyChildObject();
                    child.setParentId(parent.getParentId()); // assign parent id for foreign key

                    // hibernate can't help, determine jdbc parameters manually
                    st.setLong(1, child.getParentId());
                    st.setString(2, child.getName());
                    ...
                    st.addBatch();
                }

                // NOTE: you may want to limit the size of the batch
                st.executeBatch();
            }
        });

        // if your parent has a OneToMany relationship with child(s), refresh will populate this 
        session.refresh(parent);
        transaction.commit();
        return parent;
    }
    catch(Throwable e)
    {
        transaction.rollback();
        throw new RuntimeException(e);
    }   
}
pstanton
  • 35,033
  • 24
  • 126
  • 168
  • 1
    I am using this same technique as you provided. But there is still a problem; this method prepares a different statement for each insert as it can be shown in sql profiler. To increase the performance it needs to compile or prepare statement once then call that compiled statement for the rest of the inserts. – rogue lad Nov 25 '15 at 06:22
  • @Max from the DBMS point of view, the statement is the same and will pick up the same access plan. It would be diffent only if variable binding was not used (not the case). – Lluis Martinez Apr 10 '18 at 21:37
  • Saving about 50k records only took about 2 seconds! Amazing, thank you! – Ran Dec 07 '20 at 08:49
5

I have written a short blog which talks about batch insert gotchas and also has pointer to small project that has all the right configurations to get started with batch insert with Hibernate. See the details at http://sensiblerationalization.blogspot.com/2011/03/quick-tip-on-hibernate-batch-operation.html

prabhat jha
  • 59
  • 1
  • 1