So, yesterday, I was asked to fix a performance problem in our system: A certain search was exceeding the timeout (600s) when performed.
First thing I did was to dig up which DAO was performing that search, and found it to be using the following hibernate query:
SELECT DISTINCT u FROM TripLogHeader u
LEFT JOIN u.eventHeaderSet ug
LEFT JOIN ug.eventLocomotiveTrainSet elts
WHERE 1 = 1
AND elts.eventLocomotiveTrainPrefix LIKE :eventLocomotiveTrainPrefix
AND u.tripLogStateId.tripLogStateId >= :tripLogStateGreaterOrEqual
ORDER BY u.tripLogHeaderId DESC
My instinct told me that:
Since this query would only return results where the eventLocomotiveTrainPrefix
was set, therefore, we had no need for the LEFT JOIN
and a simple JOIN
would do the trick.
Well, that worked, the query that was exceeding 600s returned in miliseconds with the plain JOIN
instead the LEFT JOIN
.
But why?
My first assumption was that the database was taking too long to return all results, so I ran a few tests to try and prove that:
Running a simple SQL query on the database, without WHERE
clause, using a LEFT JOIN
, returned around 100 thousand results, in miliseconds.
Running the same SQL query, without WHERE
clause, using an INNER JOIN
, returned around 1 thousand results in miliseconds.
Adding a WHERE
clause, with both, LEFT
or INNER
join returns the same 40 results, both in miliseconds.
Making some research, I saw someone telling to try and run the query without the ORDER BY
clause. Did that, and the result is still miliseconds with the LEFT JOIN
!
I cannot figure out why LEFT JOIN
+ ORDER BY
causes the query to timeout, when any of those used by themselves doesn't.
Additionally, the query generated by hibernate (hibernate.show_sql = true
) is the following:
Hibernate:
select
distinct top 20 triploghea0_.trip_log_header_id as trip_log1_62_,
triploghea0_.event_header_timestamp_begin as event_he2_62_,
triploghea0_.event_header_timestamp_end as event_he3_62_,
triploghea0_.map_block_name_destination as map_bloc4_62_,
triploghea0_.map_block_name_origin as map_bloc5_62_,
triploghea0_.own_rollingstock_entry_tag as own_roll6_62_,
triploghea0_.trip_log_header_adherence as trip_log7_62_,
triploghea0_.trip_log_header_consumption as trip_log8_62_,
triploghea0_.trip_log_header_distance as trip_log9_62_,
triploghea0_.trip_log_header_eficiency as trip_lo10_62_,
triploghea0_.trip_log_header_flush_tstamp as trip_lo11_62_,
triploghea0_.trip_log_header_stoped_time as trip_lo12_62_,
triploghea0_.trip_log_header_total_time as trip_lo13_62_,
triploghea0_.trip_log_state_id as trip_lo15_62_,
triploghea0_.user_entry_registry as user_en14_62_
from
trip_log_header triploghea0_
left outer join
event_header eventheade1_
on triploghea0_.trip_log_header_id=eventheade1_.trip_log_header_id
left outer join
event_loco_train eventlocom2_
on eventheade1_.event_header_id=eventlocom2_.event_header_id
where
1=1
and (
eventlocom2_.event_loco_train_prefix like ?
)
and triploghea0_.trip_log_state_id>=?
order by
triploghea0_.trip_log_header_id desc
Running that query in the database, using the same JDBC through Netbeans, brings the same 40 results, in miliseconds as well.
If the database wasn't the one performing poorly, what was? That sounded fishy, so I went to research some more. Found something interesting about the MS SQL Server driver actually changing the way the query was sent. I changed the driver, tested all the options possible in the driver. Nothing, nada. Same behavior all over again.
Trying changing SQL Server's query plan configuration and still, no change.
I'm clueless. Why would the combination of LEFT JOIN
and ORDER BY
on Hibernate make a query hang, but removing either of the same query doesn't?
Why running the same query directly with the JDBC doesn't hangs?
Oh yes, I also tried changing ALL @OneToMany relationships in all three different entities to FetchType.LAZY
to no avail (the original entities are using FetchMode.SUBSELECT
, for reasons unknown to me).
UPDATE:
I tried adding full constructors to the Entity classes, just because I am trying anything, really, and still no difference.
Seems the problem is really a combination of LEFT JOIN
and ORDER BY
. But no idea why.
Any ideas, no matter how far fetched are most welcome, I ran out of ideas myself. Thanks for your time!