
I am using Hibernate 5.2.8.Final, and we have a requirement to read millions of rows from the database, apply some business logic to each, and update them. As my database is huge, I want to commit after every batch is reached, so I have written the code below:

Session session = HibernateUtil.getSessionFactory().getCurrentSession();
session.getTransaction().begin();
Query<Object> query = session.createQuery(SQL, Object.class);
ScrollableResults scrollableResults = query.setCacheMode(CacheMode.IGNORE)
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
while (scrollableResults.next())
{
    Object object = (Object) scrollableResults.get(0);
    process(object);
    session.update(object);
    if (++count % batchSizeDeclare == 0)
    {
        // flush the pending updates and clear the persistence context
        session.flush();
        session.clear();
        LOGGER.info("Updated batch records");
    }
}
session.getTransaction().commit();
LOGGER.info("committed in Hibernate");

Below is my hibernate.cfg.xml file:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
        "-//Hibernate/Hibernate Configuration DTD//EN"
        "http://www.hibernate.org/dtd/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <!-- Database connection settings -->
        <property name="connection.driver_class">com.microsoft.sqlserver.jdbc.SQLServerDriver</property>
        <property name="connection.url">jdbc:sqlserver://com;database=DEV</property>
        <property name="connection.username">user</property>
        <property name="connection.password">pass</property>

        <property name="hibernate.default_schema">schema</property>

        <!-- JDBC connection pool (use the built-in) -->
        <property name="connection.pool_size">5</property>

        <!-- SQL dialect -->
        <property name="dialect">org.hibernate.dialect.SQLServer2012Dialect</property>

        <!-- Disable the second-level cache -->
        <property name="cache.provider_class">org.hibernate.cache.internal.NoCacheProvider</property>

        <!-- Echo all executed SQL to stdout -->
        <!-- <property name="show_sql">true</property> -->
        <!-- <property name="format_sql">true</property> -->

        <!-- Update the database schema on startup -->
        <property name="hbm2ddl.auto">update</property>

        <!-- org.hibernate.HibernateException: No CurrentSessionContext configured! -->
        <property name="hibernate.current_session_context_class">thread</property>

        <property name="hibernate.jdbc.batch_size">100</property>

        <property name="hibernate.c3p0.min_size">5</property>
        <property name="hibernate.c3p0.max_size">20</property>
        <property name="hibernate.c3p0.timeout">300</property>
        <property name="hibernate.c3p0.max_statements">50</property>
        <property name="hibernate.c3p0.idle_test_period">3000</property>

        <mapping class="com.beans.Object" />

    </session-factory>
</hibernate-configuration>

Below is my Object.java:

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Object implements Serializable
{
    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "ID", nullable = false, unique = true, updatable = false)
    private int id;

    private String filePath;
    private String fileName;
    private String mimeType;
    private double fileSize;

    // getters and setters omitted
}

Once my code reaches session.flush(), it does not appear to do anything, even after waiting for 30 minutes. Is this the correct way to commit in batches? How do I batch update?

Vlad Mihalcea
rajadilipkolli

1 Answer


Once my code reaches session.flush(), it does not appear to do anything, even after waiting for 30 minutes.

On the contrary, the database seems to be doing way too much. It's just that you don't see any progress because the database is struggling to cope with the huge amount of work that you submitted.

Is this the correct way to batch commit?

The short answer is No.

You don't have to fetch millions of rows from the DB. You have better options:

  1. You can do the processing in the database, so that you don't pay the price of extracting data and sending it over the network, only to process it in Java.
  2. If you can't process it in the DB, then you need to use a batch processor that only fetches small chunks of data at a time. This way, you can even parallelize the batch processing, which should reduce the overall processing time.
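As a rough sketch of option 2, the idea is to page through the table in fixed-size chunks and commit each chunk in its own short transaction, rather than holding one transaction open across millions of rows. The class below is illustrative, not from the original answer: the `pageCount` helper and `PAGE_SIZE` constant are hypothetical names, and the Hibernate loop is shown as a comment because it assumes a `SessionFactory` and a `process()` method that are not defined here.

```java
public class ChunkedUpdateSketch {
    // Chunk size; aligning it with hibernate.jdbc.batch_size (100 in the
    // config above) lets each page flush as a single JDBC batch.
    static final int PAGE_SIZE = 100;

    // Number of pages needed to cover `total` rows (ceiling division).
    static int pageCount(long total, int pageSize) {
        return (int) ((total + pageSize - 1) / pageSize);
    }

    /*
     * Hypothetical processing loop (sessionFactory and process() are
     * assumed to exist; kept as a comment so the sketch stays
     * dependency-free):
     *
     * for (int page = 0; page < pageCount(total, PAGE_SIZE); page++) {
     *     Session session = sessionFactory.openSession();
     *     Transaction tx = session.beginTransaction();
     *     List<Object> chunk = session
     *             .createQuery("from Object o order by o.id", Object.class)
     *             .setFirstResult(page * PAGE_SIZE)
     *             .setMaxResults(PAGE_SIZE)
     *             .list();
     *     for (Object o : chunk) {
     *         process(o);
     *         session.update(o);
     *     }
     *     tx.commit();   // short transaction per chunk; locks released early
     *     session.close();
     * }
     */
}
```

Because each chunk is independent, the pages can also be distributed across worker threads to parallelize the processing, as the answer suggests.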
Vlad Mihalcea
  • One more question: I have set my batchSizeDeclare variable to 100, which matches the bulk update size (hibernate.jdbc.batch_size) in my hibernate.cfg.xml file. Would 100 records really take more than 30 min? Also, I have read that declaring the id value as a sequence is best; does changing it help performance? – rajadilipkolli Feb 24 '17 at 14:12
  • Log the SQL queries and make sure that only 100 entries are executed. It's also important to check the DB logs to see what's happening there as well. – Vlad Mihalcea Feb 24 '17 at 14:53
  • In the SQL queries it's showing only 100. In the DB logs, another batch of data is being inserted, which contains 10 million records inserted in chunks – rajadilipkolli Feb 24 '17 at 15:06
  • A batch of data of 10 million entries kinda explains why the server is stuck, don't you agree? – Vlad Mihalcea Feb 24 '17 at 15:11
  • I don't think I have any such example. You might want to take a look at Spring Integration. It offers many EIP constructs to help you with this task. – Vlad Mihalcea Feb 27 '17 at 10:55