Should we not use Hibernate for large databases?

Question

Recently I started learning Hibernate, and while browsing I came across this site: Hibernate Vs JDBC.

The link describes two tables, User and Contract, where each user has 3 contracts. The User table holds 100,000 records and the Contract table holds 300,000.

The link gives an example of how performance is affected when the tables hold hundreds of thousands of records.

I ran the code on my machine, and the plain JDBC code took just 486 ms to get the User and Contract details by joining both tables.

When I used Hibernate for the same operation, it took considerably longer, as shown below:

// Using fetch mode @Fetch(FetchMode.SUBSELECT)
test1 : 11

// Using fetch mode @Fetch(FetchMode.SELECT)
test2 : 50

// Using fetch mode @Fetch(FetchMode.JOIN)
test3 : 45

// Using an HQL query with the join fetch option
test4 : 7

// Using a Hibernate native SQL query
test5 : 3

The numbers here are in seconds.

So does it mean that Hibernate is useful only for small projects?

Should we use plain JDBC if the database has a few hundred thousand records? I think tables of this size are common in many applications, so how are developers using Hibernate in such cases?

@user1354678, no, I just followed the example given in the link I mentioned. — learner, Jun 18 '15 at 16:52
@Seelenvirtuose, the ids of both tables are declared as primary keys. To my knowledge all primary keys are indexed by default; let me know if I am wrong. — learner, Jun 18 '15 at 16:55
When you say "get" in plain JDBC, did you do a select, iterate through the result, and map it to entities programmatically? — 6ton, Jun 18 '15 at 16:56
@6ton, yes, exactly. You can see the code given in the link mentioned in my post. — learner, Jun 18 '15 at 16:58
@Jarrod Roberson, I can explain the cause of the delay in Hibernate if someone reopens this question. The duplicate link explains Hibernate vs JDBC, but it does not explain to the OP why it is slow in his experiment code. — The Coder, Jun 18 '15 at 17:54
@user1354678, can you please add it as a comment or an answer? I don't think closing the question will block you from answering. — learner, Jun 18 '15 at 18:03
OK, I'll try to add it as a comment, or I'll type everything in a file and link that file in a comment, as it will include programmatic explanations too. — The Coder, Jun 18 '15 at 18:04
That duplicate does answer this question. Hibernate is not a panacea; if it is not the correct tool, use another one. Straight JDBC does not have the overhead and degenerate performance problems that Hibernate is plagued with. — , Jun 19 '15 at 16:20

score 1 · Answer 1 · edited May 23 '17 at 11:51

Your table may well contain hundreds of thousands of entries, but loading that many entries at once is effectively batch processing. That is probably better done with JDBC, or at least not by loading all entries into a session without accounting for how many there are.

See also: JPA: what is the proper pattern for iterating over large result sets?
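As a rough illustration of the paging pattern that the linked question is about, here is a minimal plain-Java sketch. `PageFetcher`, `forEachPage`, and `inMemoryTable` are hypothetical names made up for this example, not Hibernate or JPA API, and the in-memory list merely stands in for a table. With JPA, the fetcher would typically wrap a query that calls `setFirstResult(offset)` and `setMaxResults(limit)`, with a stable `ORDER BY` so that pages do not overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedIteration {

    /** Hypothetical abstraction: returns at most 'limit' rows starting at 'offset'. */
    interface PageFetcher<T> {
        List<T> fetch(int offset, int limit);
    }

    /**
     * Processes all rows one page at a time, so that only 'pageSize' rows
     * need to be held in memory at once. Returns the number of pages fetched.
     */
    static <T> int forEachPage(PageFetcher<T> fetcher, int pageSize, Consumer<T> action) {
        int offset = 0;
        int pages = 0;
        while (true) {
            List<T> page = fetcher.fetch(offset, pageSize);
            if (page.isEmpty()) {
                break; // nothing left to read
            }
            pages++;
            page.forEach(action);
            if (page.size() < pageSize) {
                break; // a short page means this was the last one
            }
            offset += pageSize;
        }
        return pages;
    }

    /** Stand-in for a database table: serves pages out of an in-memory list. */
    static PageFetcher<Integer> inMemoryTable(List<Integer> rows) {
        return (offset, limit) -> {
            if (offset >= rows.size()) {
                return List.of();
            }
            return rows.subList(offset, Math.min(rows.size(), offset + limit));
        };
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        long[] sum = {0};
        int pages = forEachPage(inMemoryTable(rows), 1000, r -> sum[0] += r);
        System.out.println(pages + " pages, sum = " + sum[0]);
    }
}
```

The point of the design is that memory use is bounded by the page size rather than by the table size; whether offset-based paging or a scrolling cursor is the better fit depends on the database and the workload.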

answered Jun 18 '15 at 16:58 by user140547 · edited May 23 '17 at 11:51 by Community

The link talks about Hibernate's pagination concept, so is that the only workaround when we have more data? – learner Jun 18 '15 at 17:07
There is also `ScrollableResults` with a stateless session. Anyway, I think there is nothing "wrong" with using JDBC for batch processing and Hibernate for OLTP. – user140547 Jun 18 '15 at 17:12
@user140547 the size of the database doesn't matter much. It's all about the amount of data you load into memory and the number of queries used to load it. Use cases that need to load 1 lakh (100,000) users into memory are extremely rare. You typically load one user, or a few users. – JB Nizet Jun 18 '15 at 17:20
@user140547, thanks for pointing to `ScrollableResults`. To my knowledge batch processing is used only for `insert` or `update` queries, but my post is about `select` queries. Also, what is OLTP? Can you please provide an example to help me understand that term? – learner Jun 18 '15 at 17:28
@jb-nizet Actually I was trying to say exactly that – user140547 Jun 18 '15 at 17:28
OLTP = https://en.wikipedia.org/wiki/Online_transaction_processing, in contrast to batch processing. In my opinion, it does not matter whether you are selecting or updating; it is batch processing when you read hundreds of thousands of entries – user140547 Jun 18 '15 at 17:31
@user140547, I have seen batch processing only for inserts and updates. Can you please show me how we can use it for select queries? If possible, a link to some docs is fine so I can get the details. – learner Jun 18 '15 at 19:08
I am not sure what you mean by "batch processing". In this context, it basically just means that a database operation works on thousands or hundreds of thousands of records. – user140547 Jun 18 '15 at 19:34

score 1 · Answer 2 · answered Jun 18 '15 at 17:00

Hibernate does not optimize performance. There is no magic. At best, it can be as fast as raw JDBC. Every time someone complains about it, I remind them of tuning. Everything needs to be tuned, even the database itself: indexes and partitioning. Out-of-the-box performance (with everything at its defaults) is only suitable for POCs.

What Hibernate does (and, by the way, you should be using standard JPA rather than Hibernate directly) is save you from writing tedious mapping and other plumbing code that has a tendency to turn into a spaghetti mess. Those maintainability problems would kill your project much faster than any performance issues.

Optimizing Hibernate includes choosing proper lazy vs. eager fetching, etc., to avoid the N+1 select problem, as well as indexing and partitioning. You should have 100% clarity on how it translates its queries into raw SQL, and tweak it when you see something you don't like.
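To make the N+1 select problem concrete, here is a small plain-Java sketch that only counts simulated statements. It does not use Hibernate or a real database, and `FakeDb`, `loadWithSelectPerUser`, and `loadWithJoin` are hypothetical names invented for the illustration. With the 100,000 users from the question, loading each user's contracts through a separate select issues 100,001 statements, while a single join (which is roughly what an HQL `join fetch` asks for) issues one. That difference in round trips is likely one contributor to the timings in the question, though not necessarily the only one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NPlusOneDemo {

    /** Simulated database that counts every statement it executes. */
    static class FakeDb {
        final int userCount;
        final int contractsPerUser;
        int statements = 0;

        FakeDb(int userCount, int contractsPerUser) {
            this.userCount = userCount;
            this.contractsPerUser = contractsPerUser;
        }

        /** Stands in for: select id from User */
        List<Integer> selectAllUserIds() {
            statements++;
            List<Integer> ids = new ArrayList<>();
            for (int u = 0; u < userCount; u++) {
                ids.add(u);
            }
            return ids;
        }

        /** Stands in for: select id from Contract where user_id = ? */
        List<Integer> selectContractIdsFor(int userId) {
            statements++;
            List<Integer> ids = new ArrayList<>();
            for (int c = 0; c < contractsPerUser; c++) {
                ids.add(userId * contractsPerUser + c);
            }
            return ids;
        }

        /** Stands in for one joined select; each row is {userId, contractId}. */
        List<int[]> selectUsersJoinContracts() {
            statements++;
            List<int[]> rows = new ArrayList<>();
            for (int u = 0; u < userCount; u++) {
                for (int c = 0; c < contractsPerUser; c++) {
                    rows.add(new int[] {u, u * contractsPerUser + c});
                }
            }
            return rows;
        }
    }

    /** N+1 pattern: one statement for the users, then one more per user. */
    static Map<Integer, List<Integer>> loadWithSelectPerUser(FakeDb db) {
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        for (int userId : db.selectAllUserIds()) {
            result.put(userId, db.selectContractIdsFor(userId));
        }
        return result;
    }

    /** Join pattern: one statement, with the flat rows grouped by user in memory. */
    static Map<Integer, List<Integer>> loadWithJoin(FakeDb db) {
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        for (int[] row : db.selectUsersJoinContracts()) {
            result.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        FakeDb perUser = new FakeDb(100_000, 3);
        loadWithSelectPerUser(perUser);
        FakeDb joined = new FakeDb(100_000, 3);
        loadWithJoin(joined);
        System.out.println("select per user: " + perUser.statements + " statements");
        System.out.println("single join: " + joined.statements + " statement(s)");
    }
}
```

Both methods build the same user-to-contracts map; only the number of statements differs, which is why checking the SQL that Hibernate actually generates is worth the effort.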

Now, if you have really large data sets, billions or trillions of records of telemetry or statistical data, you should look at a column-store NoSQL database, aka Big Table. Currently Cassandra is the fastest. It is basically a huge distributed index.

Thanks for answering. The tables already have primary keys, so the indexes are there. How can I improve the performance in this case? I am not able to work that out from your answer. Can you please clarify? — learner, Jun 18 '15 at 17:05
