
I have a moderately big table in SQL Server, about 40 million rows. In particular, the table has an XML column whose values can be large.

I am experiencing very poor performance when retrieving data from this table using the latest version of mssql-jdbc. I am looking for ways to improve performance both on the Java side and on the database side.

On the Java side, this is the relevant code for fetching data:

String connectionUrl = "jdbc:sqlserver://10.10.10.28:1433;databaseName=MYDB;user=MYUSER;password=MYPWD;selectMethod=direct;sendStringParametersAsUnicode=false;responseBuffering=adaptive;";

String query = "SELECT * FROM MYTABLE WHERE DateTimeField > '2019-03-25 00:00:00.0' AND DateTimeField < '2019-03-25 13:00:00.0'";

try (Connection sourceConnection = DriverManager.getConnection(connectionUrl);
     Statement stmt = sourceConnection.createStatement(
             SQLServerResultSet.TYPE_SS_SERVER_CURSOR_FORWARD_ONLY,
             SQLServerResultSet.CONCUR_READ_ONLY)) {

    stmt.setFetchSize(100);

    // Declare the ResultSet in try-with-resources so it is always closed.
    try (ResultSet resultSet = stmt.executeQuery(query)) {

        while (resultSet.next()) {

            // do something

        }
    }
}

I've set the fetch size to 100 rows because I've noticed problems when the fetch size is too large, probably because of the XML column, which is heavy to send over the network.
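Related to the adaptive response buffering already enabled in the connection string: reading columns in SELECT order and reading the large XML value as a stream can keep the driver from materializing each XML value in memory. Below is a minimal sketch of an alternative loop body; the column name XmlField is a hypothetical name for the XML column, not code from the original application.

// Requires imports: java.io.IOException, java.io.Reader, java.sql.SQLXML, java.sql.Timestamp
while (resultSet.next()) {

    // Read the smaller columns first, in SELECT order.
    Timestamp dateTime = resultSet.getTimestamp("DateTimeField");

    // Stream the XML value instead of building one large String.
    // "XmlField" is a hypothetical column name.
    SQLXML xml = resultSet.getSQLXML("XmlField");
    try (Reader reader = xml.getCharacterStream()) {
        char[] buffer = new char[8192];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            // process the chunk here
        }
    } catch (IOException e) {
        throw new RuntimeException("Failed to read XML column", e);
    } finally {
        xml.free();
    }
}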

Does anyone have any suggestions for improving performance when fetching data in this scenario?

revy
  • On the DB side, use indexes; that will help boost performance at the server end. On the Java side, you could use pagination to limit the data to a page size and fetch it in smaller chunks (see the sketch after these comments). Have a look at this: https://stackoverflow.com/questions/22998281/fetching-millions-of-records-in-java – Ali Azim Mar 25 '19 at 10:58
  • Hi @AliAzim, I am using fetchSize to specify the number of rows retrieved each time. Is that enough, or are there other parameters I could use? – revy Mar 25 '19 at 11:03
  • Yes, it is, but you should do some research on it. BTW, you could use caches to optimize performance, and instead of querying the DB table directly you could use a stored procedure. – Ali Azim Mar 25 '19 at 11:11
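Regarding the pagination suggestion in the first comment, here is a minimal keyset-style sketch. It assumes the table has an indexed key column named Id, which is a hypothetical name; each batch resumes after the last key seen, which avoids the cost of large OFFSETs on deep pages.

// Keyset pagination sketch. "Id" is a hypothetical indexed key column;
// connectionUrl is the same connection string as in the question.
String pageQuery =
        "SELECT TOP (1000) * FROM MYTABLE "
      + "WHERE DateTimeField > ? AND DateTimeField < ? AND Id > ? "
      + "ORDER BY Id";

long lastId = 0L;           // resume point, assuming Id values are positive
boolean moreRows = true;

try (Connection conn = DriverManager.getConnection(connectionUrl);
     PreparedStatement ps = conn.prepareStatement(pageQuery)) {

    while (moreRows) {
        ps.setTimestamp(1, Timestamp.valueOf("2019-03-25 00:00:00.0"));
        ps.setTimestamp(2, Timestamp.valueOf("2019-03-25 13:00:00.0"));
        ps.setLong(3, lastId);

        moreRows = false;
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                moreRows = true;
                lastId = rs.getLong("Id");
                // do something with the row
            }
        }
    }
}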

1 Answer


Since you have a filter that is quite narrow, the performance issue is most likely due to the lack of an index rather than the size of the XML column.

Run a simple

SELECT COUNT(*) FROM ......

with the same filter and see how it performs. Most likely it will be slow too, which would rule out the XML column as the cause.
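For reference, the check could be run from the same Java code; this is only a diagnostic sketch, reusing MYTABLE, DateTimeField, and connectionUrl from the question.

// Diagnostic: count the rows matching the same date filter.
// If this is also slow, the XML column is not the bottleneck.
String countQuery =
        "SELECT COUNT(*) FROM MYTABLE "
      + "WHERE DateTimeField > '2019-03-25 00:00:00.0' "
      + "AND DateTimeField < '2019-03-25 13:00:00.0'";

try (Connection conn = DriverManager.getConnection(connectionUrl);
     Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery(countQuery)) {
    if (rs.next()) {
        System.out.println("Matching rows: " + rs.getLong(1));
    }
}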

Saad Ahmad
  • It took about 20 minutes for the SELECT COUNT to complete. So are you suggesting creating an index on the DateTimeField column? – revy Mar 26 '19 at 09:35
  • Yes, by creating an index; if the selectivity is good, it should be fine. Remember, the number of rows by itself does not affect performance, it is the access path (a sketch of such an index follows after these comments). – Saad Ahmad Mar 26 '19 at 18:29
  • Here is a general blog entry about the misconception that the number of rows in a table by itself affects performance: http://saadwmsblog.blogspot.com/2014/05/myth-buster-number-of-rows-in-table-is.html – Saad Ahmad Mar 26 '19 at 19:52
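For the index discussed in these comments, one possible shape is a nonclustered index on the filter column. The index name is hypothetical and this is only a sketch; in practice the index would usually be created as plain T-SQL from SSMS rather than from application code.

// Sketch: create a nonclustered index on the filter column.
// IX_MYTABLE_DateTimeField is a hypothetical name; check existing indexes first.
try (Connection conn = DriverManager.getConnection(connectionUrl);
     Statement stmt = conn.createStatement()) {
    stmt.execute(
            "CREATE NONCLUSTERED INDEX IX_MYTABLE_DateTimeField "
          + "ON MYTABLE (DateTimeField)");
}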