0

Using Java with JDBC
Querying data from a postgres database that stores ship information.
I'm creating a vessel object from the data and adding coordinate data to the vessel object's coordinate list.

    ResultSet rs = stmt.executeQuery("SELECT mmsi, report_timestamp, position_geom, ST_X(position_geom) AS Long, "+ 
                "ST_Y(position_geom) AS Lat FROM reports3 WHERE position_geom IS NOT NULL ORDER by report_timestamp ASC"); 

 TreeMap <Long, Vessel> vessels = new TreeMap<Long, Vessel>(); 
                    long startTime2 = System.nanoTime();

                    while(rs.next()){
                        long mmsi = rs.getLong("mmsi");
                        java.util.Date time = rs.getTime("report_timestamp"); 
                        double longitude = rs.getDouble("Long");
                        double latitude = rs.getDouble("Lat");
                        Coordinate coordinate = new Coordinate(longitude, latitude, time);
                        Vessel vessel = new Vessel(mmsi); 

                        if(!vessels.containsKey(mmsi)) { //if vessel is not present in vessels
                            vessel.addCoor(coordinate);
                            vessels.put(mmsi, vessel);
                        }
                        else { //if vessel is already in vessels 
                            vessels.get(mmsi).addCoor(coordinate);
                        }
                    }

So I'm working with a table that contains a billion rows and it'll be impossible to store that many vessel objects on my machine.

I'm wondering how to iterate through the database by querying 1,000,000 rows at a time with each iteration I'll have enough information to run some methods and store important keys, then clear my vessels tree map and run the next 1,000,000 rows.

guy_sensei
  • 513
  • 1
  • 6
  • 21
  • 1
    The first thing you need to do is to configure the driver to not buffer the complete result in memory: https://jdbc.postgresql.org/documentation/94/query.html#query-with-cursor –  Jun 08 '15 at 13:05
  • ** t'll be impossible to store that many vessel objects on my machine **-- If that's your real problem...then why not define required operations in a stored procedure and fetch limited data thereafter. – Rajesh Jun 08 '15 at 13:24

1 Answers1

1

Try to use setFetchSize(int) method. More about it in this link

Community
  • 1
  • 1
Ihar Sadounikau
  • 741
  • 6
  • 20
  • That link is about the SQL Server JDBC driver - which does behave differently than the Postgres JDBC driver –  Jun 08 '15 at 13:12
  • I used it for Oracle and MySQL databases and it solved the same issue. – Ihar Sadounikau Jun 08 '15 at 13:22
  • Well, for Oracle and MySQL using `setFetchSize()` isn't necessary to begin with, because those drivers do not read the complete result set into memory in the first place. And for Postgres this will only have an effect if autocommit is turned off. For the Microsoft driver you need to use `selectMethod=cursor` and for the jTDS driver you need to use `useCursors=true` to avoid buffering of the whole result. –  Jun 08 '15 at 13:28