Java Iterator backed by a ResultSet

Question

I've got a class that implements Iterator with a ResultSet as a data member. Essentially the class looks like this:

public class A implements Iterator{
    private ResultSet entities;
    ...
    public Object next(){
        entities.next();
        return new Entity(entities.getString...etc....)
    }

    public boolean hasNext(){
        //what to do?
    }
    ...
}

How can I check if the ResultSet has another row so I can create a valid hasNext method since ResultSet has no hasNext defined itself? I was thinking doing SELECT COUNT(*) FROM... query to get the count and managing that number to see if there's another row but I'd like to avoid this.

score 42 · Answer 1 · answered Dec 08 '09 at 21:48

42

You can get out of this pickle by performing a look-ahead in the hasNext() and remembering that you did a lookup to prevent consuming too many records, something like:

public class A implements Iterator{
    private ResultSet entities;
    private boolean didNext = false;
    private boolean hasNext = false;
    ...
    public Object next(){
        if (!didNext) {
            entities.next();
        }
        didNext = false;
        return new Entity(entities.getString...etc....)
    }

    public boolean hasNext(){
        if (!didNext) {
            hasNext = entities.next();
            didNext = true;
        }
        return hasNext;
    }
    ...
}

answered Dec 08 '09 at 21:48

rsp

23,135
6
55
69

I think this solution is better than mine, since it has a smaller runtime fingerprint. – ComSubVie Dec 08 '09 at 21:51
2

BalusC raises a good point: there are good reasons not to do this. But there are also good reasons TO do this, like using Java Streams functionality to do computation on ResultSets when you know that will happen relatively fast. – Diogenes Creosote Feb 16 '17 at 03:28
1

To wrap it up it should also manage the closing of the rowset when you are done with it. – Florian F Sep 17 '17 at 20:43

BalusC · Accepted Answer · 2015-07-17T11:16:53.473

42

This is a bad idea. This approach requires that the connection is open the whole time until the last row is read, and outside the DAO layer you never know when it will happen, and you also seem to leave the resultset open and risk resource leaks and application crashes in the case the connection times out. You don't want to have that.

The normal JDBC practice is that you acquire Connection, Statement and ResultSet in the shortest possible scope. The normal practice is also that you map multiple rows into a List or maybe a Map and guess what, they do have an Iterator.

public List<Data> list() throws SQLException {
    List<Data> list = new ArrayList<Data>();

    try (
        Connection connection = database.getConnection();
        Statement statement = connection.createStatement("SELECT id, name, value FROM data");
        ResultSet resultSet = statement.executeQuery();
    ) {
        while (resultSet.next()) {
            list.add(map(resultSet));
        }
    }

    return list;
}

private Data map(ResultSet resultSet) throws SQLException {
    Data data = new Data(); 
    data.setId(resultSet.getLong("id"));
    data.setName(resultSet.getString("name"));
    data.setValue(resultSet.getInteger("value"));
    return data;
}

And use it as below:

List<Data> list = dataDAO.list(); 
int count = list.size(); // Easy as that.
Iterator<Data> iterator = list.iterator(); // There is your Iterator.

Do not pass expensive DB resources outside the DAO layer like you initially wanted to do. For more basic examples of normal JDBC practices and the DAO pattern you may find this article useful.

edited Jul 17 '15 at 11:16

answered Dec 08 '09 at 22:18

BalusC

1,082,665
372
3,610
3,555

37

Couldn't this run into memory issues depending on the number of rows? – Jordan Messina Dec 09 '09 at 00:22
3

Just query the information you actually need. In SQL you can use under each the `WHERE` and `LIMIT/OFFSET` clauses for this. Google also doesn't return all of the zillion results at once. – BalusC Dec 09 '09 at 02:41
56

What if I need all rows? I've got an app with 1,2 million rows that needs to be processed. – heffaklump Apr 23 '10 at 09:58
2

@heffaklump Even in this case you don't need a result set iterator. Iterators are not safe for result sets if the iterators are delegated outside DAO (or even its methods). Callback functions handling one row are much better: my current project operates with hundreds of million records, and they *all* can be easily exported to a CSV file no matter how much the data is, because there is no need to cache whole data neither `List` (whatever) nor `WHERE`/`LIMIT`: simply delegate an object created from a result set row to the callback. – Lyubomyr Shaydariv Feb 09 '13 at 03:12
9

I don't see why this is such a bad idea in general. `Iterators` are very short-lived objects. Let your `ResultSetIterator` also implement `AutoCloseable` and the whole thing becomes quite lean... – Lukas Eder Mar 03 '14 at 15:39
@Lukas: it won't autoclose/release the `Statement`/`Connection`. Particularly you should understand that. Better use `CachedRowSet` if you really need to. – BalusC Mar 03 '14 at 15:58
4

The lifecycle of a `Statement` / `Connection` is completely independent of the lifecycle of a `ResultSet`. This is of course assuming that anyone writing a `ResultSetIterator` knows what they're doing and won't let the `Iterator` escape the DAO or whatever entity manages the `ResultSet` lifecycle. And then, I still don't see why `Iterator` would be a bad type to represent a database cursor... – Lukas Eder Mar 03 '14 at 16:55
1

@Lukas: *"and won't let the Iterator escape the DAO"*, well there you protected yourself. Of course such an iterator is useful within the API/library you're maintaining, but it's usually useless and unsafe there outside. I'd rather make it a DAO-package-private class. – BalusC Mar 03 '14 at 17:04
1

Wow. Don't get pissed :-) Neither I nor the OP had ever implied anything that you're suggesting, at least not in explicitly written form. The question was merely about how to implement `hasNext()`, which is why I thought this answer is not the best one for anyone Googling *"ResultSet" and "Iterator"*. Because it really isn't. The best answer should provide a correct implementation. – Lukas Eder Mar 03 '14 at 17:09
9

This is a really bad solution! If I have a long-running processing job, I _want_ it to use a cursor and keep the connection open. Reading the entire database into a List is very very bad protocol. I want a Stream type interface for a darn good reason! – PlexQ Apr 12 '16 at 18:57
@PlexQ: so you're basically asking this question? http://stackoverflow.com/questions/2447324 That's quite different from the question currently asked. – BalusC Apr 12 '16 at 19:00
2

@BalusC Not really, no. As a functional programmer, it's my approach to treat all large datasets as a stream of individual results and process them thusly. This means that I can process an arbitrarily large dataset using a very small amount of resources, and parallelize across as many CPUs as I have access to. To do this however, I need a result set that can be treated as a Stream. Cursors in any good SQL Database like Oracle provide this functionality on the SQL End, which will stream me a result set in small increments rather than trying to give me gigs of data in one dump. – PlexQ Apr 13 '16 at 22:23
@BalusC I can create a thing of type Stream from a thing of type Iterable, so to take a ResultSet derived from a cursor on the SQL end, and generate an Iterator from it looks like exactly the question asked here, and also the solution provided below. I would consider converting a SQL ResultSet directly to a fully materialized List to be an anti-pattern, unless there is a very specific reason why you need it in this form. – PlexQ Apr 13 '16 at 22:25
4

This is a bad answer. It's not always a good idea to read everything into memory in stead of keeping a ResultSet open. It's not always a good idea to force a DAO abstraction over what you are doing. Someone who is searching for this, will probably be working on some sort of batch job, and will indeed want to keep the result set open. – morten Dec 01 '17 at 14:25
The Iterator approach allows you to separate the JDBC read from the iteration, so your data source does not have to be a JDBC operation. – pojo-guy Jan 24 '23 at 00:50

score 5 · Answer 3 · answered Dec 08 '09 at 21:56

5

ResultSet has an 'isLast()' method that might suit your needs. The JavaDoc says it is quite expensive though since it has to read ahead. There is a good chance it is caching the look-ahead value like the others suggest trying.

answered Dec 08 '09 at 21:56

mrvisser

924
5
9

score 5 · Answer 4 · answered Jan 05 '12 at 10:48

5

You can use ResultSetIterator, just put your ResultSet in the constructor.

ResultSet rs = ...    
ResultSetIterator = new ResultSetIterator(rs);

answered Jan 05 '12 at 10:48

B. Broto

163
1
2
10

I'm 11 years late, but this is still the best answer. – pojo-guy Jan 24 '23 at 01:52

score 4 · Answer 5 · answered May 27 '15 at 18:44

public class A implements Iterator<Entity>
{
    private final ResultSet entities;

    // Not required if ResultSet.isLast() is supported
    private boolean hasNextChecked, hasNext;

    . . .

    public boolean hasNext()
    {
        if (hasNextChecked)
           return hasNext;
        hasNext = entities.next();
        hasNextChecked = true;
        return hasNext;

        // You may also use !ResultSet.isLast()
        // but support for this method is optional 
    }

    public Entity next()
    {
        if (!hasNext())
           throw new NoSuchElementException();

        Entity entity = new Entity(entities.getString...etc....)

        // Not required if ResultSet.isLast() is supported
        hasNextChecked = false;

        return entity;
    }
}

NobodyMan · Answer 6 · 2012-03-23T03:22:54.860

One option is the ResultSetIterator from the Apache DBUtils project.

BalusC rightly points out the the various concerns in doing this. You need to be very careful to properly handle the connection/resultset lifecycle. Fortunately, the DBUtils project also has solutions for safely working with resultsets.

If BalusC's solution is impractical for you (e.g. you are processing large datasets that can't all fit in memory) you might want to give it a shot.

score 3 · Answer 7 · answered Jun 12 '12 at 21:11

I agree with BalusC. Allowing an Iterator to escape from your DAO method is going to make it difficult to close any Connection resources. You will be forced to know about the connection lifecycle outside of your DAO, which leads to cumbersome code and potential connection leaks.

However, one choice that I've used is to pass a Function or Procedure type into the DAO method. Basically, pass in some sort of callback interface that will take each row in your result set.

For example, maybe something like this:

public class MyDao {

    public void iterateResults(Procedure<ResultSet> proc, Object... params)
           throws Exception {

        Connection c = getConnection();
        try {
            Statement s = c.createStatement(query);
            ResultSet rs = s.executeQuery();
            while (rs.next()) {
                proc.execute(rs);
            }

        } finally {
            // close other resources too
            c.close();
        }
    }

}


public interface Procedure<T> {
   void execute(T t) throws Exception;
}


public class ResultSetOutputStreamProcedure implements Procedure<ResultSet> {
    private final OutputStream outputStream;
    public ResultSetOutputStreamProcedure(OutputStream outputStream) {
        this.outputStream = outputStream;
    }

    @Override
    public void execute(ResultSet rs) throws SQLException {
        MyBean bean = getMyBeanFromResultSet(rs);
        writeMyBeanToOutputStream(bean);
    }    
}

In this way, you keep your database connection resources inside your DAO, which is proper. But, you are not necessarily required to fill a Collection if memory is a concern.

Hope this helps.

score 3 · Answer 8 · answered Aug 23 '11 at 15:27

Its not a really bad idea in the cases where you need it, it's just that you often do not need it.

If you do need to do something like, say, stream your entire database.... you could pre-fetch the next row - if the fetch fails your hasNext is false.

Here is what I used:

/**
 * @author Ian Pojman <pojman@gmail.com>
 */
public abstract class LookaheadIterator<T> implements Iterator<T> {
    /** The predetermined "next" object retrieved from the wrapped iterator, can be null. */
    protected T next;

    /**
     * Implement the hasNext policy of this iterator.
     * Returns true of the getNext() policy returns a new item.
     */
    public boolean hasNext()
    {
        if (next != null)
        {
            return true;
        }

        // we havent done it already, so go find the next thing...
        if (!doesHaveNext())
        {
            return false;
        }

        return getNext();
    }

    /** by default we can return true, since our logic does not rely on hasNext() - it prefetches the next */
    protected boolean doesHaveNext() {
        return true;
    }

    /**
     * Fetch the next item
     * @return false if the next item is null. 
     */
    protected boolean getNext()
    {
        next = loadNext();

        return next!=null;
    }

    /**
     * Subclasses implement the 'get next item' functionality by implementing this method. Implementations return null when they have no more.
     * @return Null if there is no next.
     */
    protected abstract T loadNext();

    /**
     * Return the next item from the wrapped iterator.
     */
    public T next()
    {
        if (!hasNext())
        {
            throw new NoSuchElementException();
        }

        T result = next;

        next = null;

        return result;
    }

    /**
     * Not implemented.
     * @throws UnsupportedOperationException
     */
    public void remove()
    {
        throw new UnsupportedOperationException();
    }
}

then:

    this.lookaheadIterator = new LookaheadIterator<T>() {
        @Override
        protected T loadNext() {
            try {
                if (!resultSet.next()) {
                    return null;
                }

                // process your result set - I use a Spring JDBC RowMapper
                return rowMapper.mapRow(resultSet, resultSet.getRow());
            } catch (SQLException e) {
                throw new IllegalStateException("Error reading from database", e);
            }
        }
    };
}

score 2 · Answer 9 · answered Dec 08 '09 at 21:49

You could try the following:

public class A implements Iterator {
    private ResultSet entities;
    private Entity nextEntity;
    ...
    public Object next() {
        Entity tempEntity;
        if ( !nextEntity ) {
            entities.next();
            tempEntity = new Entity( entities.getString...etc....)
        } else {
            tempEntity = nextEntity;
        }

        entities.next();
        nextEntity = new Entity( entities.getString...ext....)

        return tempEntity;
    }

    public boolean hasNext() {
        return nextEntity ? true : false;
    }
}

This code caches the next entity, and hasNext() returns true, if the cached entity is valid, otherwise it returns false.

I believe you mean `return nextEntity != null;` in your `hasNext` method — Brad Mace, Sep 21 '11 at 22:01

score 2 · Answer 10 · answered Dec 08 '09 at 21:49

There are a couple of things you could do depending on what you want your class A. If the major use case is to go through every single result then perhaps its best to preload all the Entity objects and throw away the ResultSet.

If however you don't want to do that you could use the next() and previous() method of ResultSet

public boolean hasNext(){
       boolean next = entities.next();

       if(next) {

           //reset the cursor back to its previous position
           entities.previous();
       }
}

You do have to be careful to make sure that you arent currently reading from the ResultSet, but, if your Entity class is a proper POJO (or at least properly disconnected from ResultSet then this should be a fine approach.

score 2 · Answer 11 · answered Nov 15 '16 at 14:03

Here's my iterator that wraps a ResultSet. The rows are returned in the form a Map. I hope you'll find it helpful. The strategy is that I always bring one element in advance.

public class ResultSetIterator implements Iterator<Map<String,Object>> {

    private ResultSet result;
    private ResultSetMetaData meta;
    private boolean hasNext;

    public ResultSetIterator( ResultSet result ) throws SQLException {
        this.result = result;
        meta = result.getMetaData();
        hasNext = result.next();
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public Map<String, Object> next() {
        if (! hasNext) {
            throw new NoSuchElementException();
        }
        try {
            Map<String,Object> next = new LinkedHashMap<>();
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                String column = meta.getColumnName(i);
                Object value = result.getObject(i);
                next.put(column,value);
            }
            hasNext = result.next();
            return next;
        }
        catch (SQLException ex) {
            throw new RuntimeException(ex);
        }
    }
}

score 1 · Answer 12 · answered Dec 08 '09 at 21:40

entities.next returns false if there are no more rows, so you could just get that return value and set a member variable to keep track of the status for hasNext().

But to make that work you would also have to have some sort of init method that reads the first entity and caches it in the class. Then when calling next you would need to return the previously cached value and cache the next value, etc...

score 1 · Answer 13 · answered Mar 14 '14 at 04:46

Iterators are problematic for traversing ResultSets for reasons mentioned above but Iterator like behaviour with all the required semantics for handling errors and closing resources is available with reactive sequences (Observables) in RxJava. Observables are like iterators but include the notions of subscriptions and their cancellations and error handling.

The project rxjava-jdbc has implementations of Observables for jdbc operations including traversals of ResultSets with proper closure of resources, error handling and the ability to cancel the traversal as required (unsubscribe).

score 0 · Answer 14 · answered Feb 01 '13 at 07:21

I think there's enough decry over why it's a really bad idea to use ResultSet in an Iterator (in short, ResultSet maintains an active connection to DB and not closing it ASAP can lead to problems).

But in a different situation, if you're getting ResultSet (rs) and are going to iterate over the elements, but you also wanted to do something before the iteration like this:

if (rs.hasNext()) { //This method doesn't exist
    //do something ONCE, *IF* there are elements in the RS
}
while (rs.next()) {
    //do something repeatedly for each element
}

You can achieve the same effect by writing it like this instead:

if (rs.next()) {
    //do something ONCE, *IF* there are elements in the RS
    do {
        //do something repeatedly for each element
    } while (rs.next());
}

score 0 · Answer 15 · answered Dec 08 '09 at 21:46

0

Do you expect most of the data in your result set to actually be used? If so, pre-cache it. It's quite trivial using eg Spring

  List<Map<String,Object>> rows = jdbcTemplate.queryForList(sql);
  return rows.iterator();

Adjust to suit your taste.

answered Dec 08 '09 at 21:46

Dan

10,990
7
51
80

Art · Answer 16 · 2014-09-25T10:04:51.050

0

It can be done like this:

public boolean hasNext() {
    ...
    return !entities.isLast();
    ...
}

edited Sep 25 '14 at 10:04

answered Sep 24 '14 at 10:28

Art

1,302
13
25

I think it would be more helpful for the OP and further visitors, when you add some explaination to your intension. – Reporter Sep 24 '14 at 10:57
Author was asking how to implement hasNext() based on a given ResultSet. My suggestion is to use isLast() method of entities ResultSet. It throws SQLException so try { ... } catch is needed - I just left it as '...' to not clutter the example code. I hope it's more clear now :) – Art Sep 25 '14 at 10:05

score -1 · Answer 17 · answered Dec 08 '09 at 21:40

-1

It sounds like you are stuck between either providing an inefficient implementation of hasNext or throwing an exception stating that you do not support the operation.

Unfortunately there are times when you implement an interface and you don't need all of the members. In that case I would suggest that you throw an exception in that member that you will not or cannot support and document that member on your type as an unsupported operation.

answered Dec 08 '09 at 21:40

Andrew Hare

344,730
71
640
635

Implementation of hasNExt does not need to be inefficient. Two booleans are not efficient in my book. – user482745 Jun 19 '18 at 11:10

Java Iterator backed by a ResultSet

17 Answers17

Linked