
I have more than 2 GB of data in a table and need to read over 1 GB of it from that single table. I know there are various options available on the database side to achieve this, but I am looking for a better approach in Java code. Can anyone show example Java code, for instance parallel processing with multithreading?

Example code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SelectRowsExample {

  public static void main(String[] args) {

    String serverName = "localhost";
    String schema = "test";
    String url = "jdbc:mysql://" + serverName + "/" + schema;
    String username = "username";
    String password = "password";

    try {
      // Load the MySQL JDBC driver
      Class.forName("com.mysql.jdbc.Driver");
    } catch (ClassNotFoundException e) {
      System.out.println("Could not find the database driver " + e.getMessage());
      return;
    }

    // try-with-resources (Java 7+) closes the connection, statement and
    // result set automatically, even when an exception is thrown
    try (Connection connection = DriverManager.getConnection(url, username, password);
         Statement statement = connection.createStatement();
         ResultSet results = statement.executeQuery(
             "SELECT * FROM employee ORDER BY department")) {

      System.out.println("Successfully connected to the database!");

      while (results.next()) {
        String empname = results.getString("name");
        System.out.println("Fetching data by column name for row " + results.getRow() + " : " + empname);
        String department = results.getString("department");
        System.out.println("Fetching data by column name for row " + results.getRow() + " : " + department);
      }
    } catch (SQLException e) {
      System.out.println("Could not retrieve data from the database " + e.getMessage());
    }
  }
}

Here my query returns name and department details, and more than 1 GB of data will come back for each department. If I read it this way it will surely slow down the application; that is why I thought of parallel processing with multithreading. Can anyone kindly suggest how to read this huge amount of data quickly?
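For reference, the kind of multithreaded read being asked about is usually done by partitioning the table on a key and giving each worker its own slice and its own connection. The sketch below is illustrative only: it assumes a numeric `id` column on `employee` with known bounds (neither appears in the question), and the JDBC read inside each worker is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReadSketch {

  // Pure helper: split [min, max] into `parts` contiguous, non-overlapping ranges.
  static long[][] splitRange(long min, long max, int parts) {
    long[][] ranges = new long[parts][2];
    long span = (max - min + 1) / parts;
    long start = min;
    for (int i = 0; i < parts; i++) {
      long end = (i == parts - 1) ? max : start + span - 1;
      ranges[i] = new long[] { start, end };
      start = end + 1;
    }
    return ranges;
  }

  public static void main(String[] args) throws Exception {
    int threads = 4;
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
    // Assumed id bounds; in practice query SELECT MIN(id), MAX(id) FROM employee
    for (long[] range : splitRange(1, 1_000_000, threads)) {
      final long lo = range[0], hi = range[1];
      futures.add(pool.submit(new Callable<Integer>() {
        @Override
        public Integer call() throws Exception {
          // Each worker opens its OWN connection and reads only its slice:
          // SELECT name, department FROM employee WHERE id BETWEEN lo AND hi
          int rows = 0;
          // ... JDBC read loop omitted in this sketch ...
          return rows;
        }
      }));
    }
    for (Future<Integer> f : futures) {
      f.get(); // wait for every slice to finish
    }
    pool.shutdown();
  }
}
```

Whether this is actually faster depends on the database, disk and network, as the comments below point out; a single well-tuned streaming read is often enough.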

KVK
  • What makes you think parallel processing or multi threading would improve anything here? What is your current code to read from the db, in what way is it too slow? What is your target speed, what hardware is your java app running on, what hardware is the server running on? Etc..... – luk2302 Sep 02 '20 at 10:46
  • @luk2302 I am looking for a better approach to do this. If I read more than 1 GB of data the normal way, it will affect the performance of the application. – KVK Sep 02 '20 at 10:48
  • No, it won't. What is "normally"? – luk2302 Sep 02 '20 at 10:48
  • @luk2302 OK, could you suggest how to read large data from the DB without any performance issue, with some sample code in Java 7? – KVK Sep 02 '20 at 10:50
  • No. There is no performance issue to begin with unless you state **VERY CLEARLY** what performance you are trying to achieve. *"performance issue"* is a nice buzzword that means absolutely nothing on its own. – luk2302 Sep 02 '20 at 10:50
  • Hey! You didn't mention what technology/framework are you using. Also I'm curious what is the use-case because a lot depends on it. Is it needed by a maintenance process or by - I hope not - a client request? – Nandor Sep 02 '20 at 10:53
  • @luk2302 I edited my question, kindly check. – KVK Sep 02 '20 at 11:04
  • *"if i use this way it will surly slow down the application"* - so what? What is your use case, what are your requirements? Note that 1 GB of data is a minuscule amount. If you read 1 TB of data you might get into trouble. And in both cases Java cannot solve the issue alone; it depends on the db, the network, the hardware, etc. You provided NO context whatsoever. – luk2302 Sep 02 '20 at 11:23

1 Answer


In your example you don't need a high-calibre gun like parallelism. It also wouldn't necessarily solve your problem, because there can be many bottlenecks in hardware, network, etc., as luk2302 mentioned.

There are two much easier tweaks:

  • Select only the data you really need. Even if your employee record has only 3 columns, you can spare a third of the data, which results in a speed increase and lower memory consumption; even more so if it has many more columns.
ResultSet results = statement.executeQuery("SELECT name, department FROM employee ORDER BY department");
  • The default fetchSize won't be enough. Its value depends on the driver; for example, by default when Oracle JDBC runs a query, it retrieves a result set of 10 rows at a time from the database cursor. I know you are using MySQL, but it works roughly the same way. By increasing the fetch size you reduce the total number of round trips to the database cursor, which are costly. I therefore recommend increasing it to 500 or 1000, but you can experiment with even higher values. More info on fetchSize: What does Statement.setFetchSize(nSize) method really do in SQL Server JDBC driver?
Statement statement = connection.createStatement();
statement.setFetchSize(1000);
  • +1: System.out.println also slows down your code. You can read about it here: Why is System.out.println so slow? It's better to replace it with a logging library, or at least for testing purposes you can use something like this:
if(results.getRow()%1000 == 0) {
    System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
}
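Putting the tweaks above together, here is a minimal sketch with the connection details assumed from the question. One MySQL-specific caveat worth knowing: with MySQL Connector/J, a plain setFetchSize hint is not honoured for streaming; you request row-by-row streaming by passing Integer.MIN_VALUE on a forward-only, read-only statement, whereas with most other drivers a value like 1000 behaves as described above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchTuningSketch {

  // Pure helper: build the trimmed-down query from only the needed columns.
  static String buildQuery(String table, String orderBy, String... cols) {
    StringBuilder sb = new StringBuilder("SELECT ");
    for (int i = 0; i < cols.length; i++) {
      if (i > 0) sb.append(", ");
      sb.append(cols[i]);
    }
    return sb.append(" FROM ").append(table)
             .append(" ORDER BY ").append(orderBy).toString();
  }

  public static void main(String[] args) throws Exception {
    String url = "jdbc:mysql://localhost/test"; // assumed, as in the question
    try (Connection conn = DriverManager.getConnection(url, "username", "password");
         Statement stmt = conn.createStatement(
             ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
      // MySQL Connector/J: Integer.MIN_VALUE requests row-by-row streaming;
      // for most other drivers use e.g. stmt.setFetchSize(1000) instead.
      stmt.setFetchSize(Integer.MIN_VALUE);
      try (ResultSet rs = stmt.executeQuery(
               buildQuery("employee", "department", "name", "department"))) {
        int row = 0;
        while (rs.next()) {
          row++;
          if (row % 1000 == 0) { // log sparsely; println is slow
            System.out.println("row " + row + " : " + rs.getString("name"));
          }
        }
      }
    }
  }
}
```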

Br, Nandor

Nandor