12

I have this setup

@Table(name ="A")
EntityA {
    Long ID;
    List<EntityB> children;
}

@Table(name ="B")
EntityB {
    Long ID;
    EntityA parent;
    EntityC grandchild;
}

@Table(name ="C")
EntityC {
    Long ID;
}

The SQL query is this (I omitted irrelevant details):

select top 300 from A where ... and ID in (select parent from B where ... and grandchild in (select ID from C where ...)) order by ...

The sql query in direct database or through Hibernate (3.5) SQL runs 1000 faster than using Criteria or HQL to express this.

The SQL generated is identical from HQL and Criteria and the SQL I posted there.

[EDIT]: Correction - the sql was not identical. I didn't try the Hibernate style parameter setting on the management studio side because I did not realize this until later - see my answer.

If I separate out the subqueries into separate queries, then it is fast again.

I tried

  • removing all mappings of child, parent, ect.. and just use Long Id references - same thing, so its not a fetching, lazy,eager related.
  • using joins instead of subqueries, and got the same slow behaviour with all combinations of fetching and loading.
  • setting a projection on ID instead of retrieving entities, so there is no object conversion - still slow

I looked at Hibernate code and it is doing something astounding. It has a loop through all 300 results that end up hitting the database.

private List doQuery(
        final SessionImplementor session,
        final QueryParameters queryParameters,
        final boolean returnProxies) throws SQLException, HibernateException {

    final RowSelection selection = queryParameters.getRowSelection();
    final int maxRows = hasMaxRows( selection ) ?
            selection.getMaxRows().intValue() :
            Integer.MAX_VALUE;

    final int entitySpan = getEntityPersisters().length;

    final ArrayList hydratedObjects = entitySpan == 0 ? null : new ArrayList( entitySpan * 10 );
    final PreparedStatement st = prepareQueryStatement( queryParameters, false, session );
    final ResultSet rs = getResultSet( st, queryParameters.hasAutoDiscoverScalarTypes(), queryParameters.isCallable(), selection, session );

// would be great to move all this below here into another method that could also be used
// from the new scrolling stuff.
//
// Would need to change the way the max-row stuff is handled (i.e. behind an interface) so
// that I could do the control breaking at the means to know when to stop

    final EntityKey optionalObjectKey = getOptionalObjectKey( queryParameters, session );
    final LockMode[] lockModesArray = getLockModes( queryParameters.getLockOptions() );
    final boolean createSubselects = isSubselectLoadingEnabled();
    final List subselectResultKeys = createSubselects ? new ArrayList() : null;
    final List results = new ArrayList();

    try {

        handleEmptyCollections( queryParameters.getCollectionKeys(), rs, session );

        EntityKey[] keys = new EntityKey[entitySpan]; //we can reuse it for each row

        if ( log.isTraceEnabled() ) log.trace( "processing result set" );

        int count;
        for ( count = 0; count < maxRows && rs.next(); count++ ) {

            if ( log.isTraceEnabled() ) log.debug("result set row: " + count);

            Object result = getRowFromResultSet( 
                    rs,
                    session,
                    queryParameters,
                    lockModesArray,
                    optionalObjectKey,
                    hydratedObjects,
                    keys,
                    returnProxies 
            );
            results.add( result );

            if ( createSubselects ) {
                subselectResultKeys.add(keys);
                keys = new EntityKey[entitySpan]; //can't reuse in this case
            }

        }

        if ( log.isTraceEnabled() ) {
            log.trace( "done processing result set (" + count + " rows)" );
        }

    }
    finally {
        session.getBatcher().closeQueryStatement( st, rs );
    }

    initializeEntitiesAndCollections( hydratedObjects, rs, session, queryParameters.isReadOnly( session ) );

    if ( createSubselects ) createSubselects( subselectResultKeys, queryParameters, session );

    return results; //getResultList(results);

}

In this code

final ResultSet rs = getResultSet( st, queryParameters.hasAutoDiscoverScalarTypes(), queryParameters.isCallable(), selection, session );

it hits the database with the full SQL, but there are no results collected anywhere.

Then it proceeds to go through this loop

for ( count = 0; count < maxRows && rs.next(); count++ ) {

Where for every one of the expected 300 results, it ends up hitting the database to get the actual result.

This seems insane, since it should already have all the results after 1 query. Hibernate logs do not show any additional SQL being issued during all that time.

Anyone have any insight? The only option I have is to go to native SQL query through Hibernate.

che javara
  • 748
  • 1
  • 8
  • 18
  • 1
    That depends on the fetch size, also comparing SQL execution time and the result from hibernate is comparing apples with coconuts. Hibernate does a lot more for instance converting the result into objects which takes time depending on your mappings and amount of results can take a lot of time. – M. Deinum Oct 07 '15 at 19:21
  • I edited the question, i tried setting projection on ID as well removing mappings, so its not entity conversion related, it is really triggered by the joins/subqueries – che javara Oct 07 '15 at 19:57
  • 2
    Had the same issue with Hibernate being slow as hell months back. Ended up dropping it for anything requiring more than 10 records in a database. Don't bother searching the web for insight or fixes for this I spent days doing that and came up with nothing. Best solution; for anything requiring any sort of intensive database manipulation suck it up and code the queries yourself and cast the data into objects yourself. I personally had a ~500% decrease in run time by doing that with a data intensive app. – gcalex5 Oct 07 '15 at 20:09
  • 2
    That's the price you pay for an **O**bfuscate**R**elational**M**odel –  Oct 07 '15 at 20:18
  • Depending on your query hibernate can return a lot of rows quickly. However when you run into slow downs you can still take advantage of HIbernate but write your query. `Session.createSqlQuery()` will allow you to write your own SQL. `SqlQuery.addEntity()` will tell Hibernate to create the object for you. – markbernard Oct 07 '15 at 20:19
  • 1
    How is Hibernate supposed to collect the data returned in the resultset without looping through the resultset? Of course it does that. Your hand-coded JDBC code would have to do that as well, unless of course you want to execute a query without caring about what it returns. – JB Nizet Oct 07 '15 at 21:46
  • My guess is that it's rather an index or driver problem. Have you tried measuring the execution of the same query (and the retrieval of the data) with plain JDBC? – JB Nizet Oct 07 '15 at 21:51
  • @JB Nizet I didn't try plain jdbc, but I tried native SQL with Hibernate which works fast. The Result Set still doesn't make sense. It should be able to get the response set from a single call - why make a request to the database for every single result? Its like some unnecessary scrolling. – che javara Oct 08 '15 at 13:57
  • I don't see where it does a request to the database for every single result. You said it yourself BTW: Hibernate logs do not show any additional SQL being issued during all that time. – JB Nizet Oct 08 '15 at 14:12
  • @JB Nizet exactly, the logs dont show sql, but in the loop (for ( count = 0; count < maxRows && rs.next(); count++ ) { I can verify on the stack trace that a request is being made to the database over a socket connection originating from rs.next() – che javara Oct 08 '15 at 14:19
  • That's the JDBC driver doing its job of fetching data from the database. It doesn't have anything to do with Hibernate. – JB Nizet Oct 08 '15 at 15:32
  • @JB Nizet if you look at the code above List doQuery, that's Hibernate code an Hibernate is driving the calls to rs.next(). In other words, Hibernate is telling the driver to fetch every individual row from the database. I could understand if I made a scrolling request, but I didn't. Are you saying that it is impossible with any driver to get more than one row in a single call? – che javara Oct 08 '15 at 19:25
  • Again: rs.next() is absolutely necessary to get the data returned by the query. How would you get data out of the resultset without calling rs.next()? Have you ever used JDBC? Learn how it works before making incorrect assumptions. – JB Nizet Oct 08 '15 at 19:29
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/91785/discussion-between-che-javara-and-jb-nizet). – che javara Oct 08 '15 at 19:31
  • so you are saying then, that there is no such thing as getting all results, you always must scroll through every one individually. Actually deeper in that code I saw that it was making a call per column of every row! – che javara Oct 08 '15 at 19:33
  • Yes, that's what I'm saying. A JDBC ResultSet is a bit like an Iterator over a collection, except the collection is not (necessarily) in memory, but elements are retrieved over the network while scrolling through the resultset. And for each row of the resultset, you of course also have to get each column individually. See https://docs.oracle.com/javase/tutorial/jdbc/basics/retrieving.html – JB Nizet Oct 08 '15 at 19:38
  • 1
    @Thanks JB Nizet for the clarification - I looked over the basics. That being said this is the most asinine thing Ive heard yet.The databases giving back row data in individual calls - It would be like a browser client making a separate request for each property and attribute of an HTML page individually. I can't imagine how this would be friendly to performance. Looking even deeper under the covers it looks like jdbc has a fetch size which is how many rows it wants on a single trip - seems the default is 10. https://docs.oracle.com/cd/E11882_01/java.112/e16548/resltset.htm#JJDBC28621 – che javara Oct 13 '15 at 13:54

2 Answers2

6

I finally managed to get to the bottom of this. The problem was being caused by Hibernate setting the parameters separately from the actual SQL query that involved subqueries. So native SQL or not, the performance will be slow if this is done. For example this will be slow:

String sql = some sql that has named parameter = :value
SQLQuery sqlQuery = session.createSQLQuery(sql);
sqlQuery.setParameter ("value", someValue);
List<Object[]> list = (List<Object[]>)sqlQuery.list();

And this will be fast

String sql = some native sql where parameter = 'actualValue'
SQLQuery sqlQuery = session.createSQLQuery(sql);
List<Object[]> list = (List<Object[]>)sqlQuery.list();

It seems that for some reason with letting Hibernate take care of the parameters it ends up getting stuck in the resultSet fetching. This is probably because the underlying query on the database is taking much longer being parameterized. I ended up writing the equivalent of Hibernate Criteria and Restrictions code that sets the parameters directly as above.

che javara
  • 748
  • 1
  • 8
  • 18
  • 14
    this solution is opening you up to a potential sql injection attack. – jtahlborn Oct 13 '15 at 14:31
  • Thanks, noted this issue – che javara Oct 20 '15 at 15:00
  • You say you use Hibernate 3.5, have you also tried with a more recent version such as 4.3? I don't experience any of these performance problems myself - although I must admit I don't use the Hibernate API directly, I use JPA. – Gimby Nov 04 '15 at 20:15
  • I have not tried higher version of hibernate, but I confirmed the slowdown is not in hibernate code, but on the database side triggered from Hibernate code when it is supplied with named parameters. It depends on the query that you are doing and the total number of rows in your database before you notice an effect. – che javara Nov 10 '15 at 19:13
  • Are you using Sql Server? Probably your problem is related with the jTDS driver and varchar parameter. Sometimes you think the hibernate are doing `where parameterr = 'actualValue'` but it is doing `where parameterr = N'actualValue'` – Dherik Mar 30 '18 at 16:14
6

We noticed a similar behaviour in our system.

And also encountered that writing the query with hardcoded parameters instead of using setParameter() would fixed the issue.

We are using MS SQL Server and after further investigation we noticed the the root cause of our issue is a default configuration of the sql server driver that transmits the query parameters as unicode. This lead to our indices being ignored since they were based on the ascii values on the queried columns.

The solution was to setup this property in the jdbc url : sendStringParametersAsUnicode=false

More details can be found here. https://stackoverflow.com/a/32867579

BraveHeart
  • 302
  • 2
  • 11