In a related Scala question, I asked the following:

When I need to read millions of rows from a PostgreSQL database using the JDBC driver, I always use a cursor; otherwise I get an OutOfMemoryError. Here is the pattern (pseudocode) that I use:

begin transaction
execute("declare cursor...")
while (true) {
  boolean processedSomeRows = false
  resultSet = executeQuery("fetch forward...")
  while (resultSet.next()) {
    processedSomeRows = true
    ...
  }
  if (!processedSomeRows) break
}
close cursor
commit

How can this be done in idiomatic Haskell?

Ralph
  • Have you tried lazily fetching the rows with HDBC or its likes and making sure they are consumed nicely? Does this cause a memory problem? – Sarah Oct 09 '12 at 12:51
  • I have not actually tried this in Haskell. I am looking for a pattern for processing an outer loop that depends on the processing in an inner loop. – Ralph Oct 09 '12 at 13:30

1 Answer

There is a fairly new concept for dealing with streams such as SQL cursors: iteratees, enumerators, or conduits. For example, using the conduit library (adapted from the Persistent book):

runResourceT $ withStmt "declare cursor..." []
    $$ mapM_ doSomethingWithSingleResult

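Here withStmt "declare cursor..." [] creates a source that yields rows, mapM_ doSomethingWithSingleResult creates a sink that processes one row at a time (this is the conduit mapM_ from Data.Conduit.List, not the Prelude one), and $$ connects the source to the sink.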
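
For comparison, the batch-by-batch loop from the question can also be written directly as a small recursive function, without a streaming library. The following is only a minimal sketch using base: forEachBatch, fakeFetch and demo are hypothetical names, and fakeFetch simulates "fetch forward" with an in-memory list instead of talking to a real database.

```haskell
import Control.Monad (unless)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- | Repeatedly run a batch-fetching action and process every row in the
-- batch, stopping as soon as a fetch returns no rows. This plays the role
-- of the outer `while (true)` loop and its `processedSomeRows` flag.
forEachBatch :: Monad m => m [row] -> (row -> m ()) -> m ()
forEachBatch fetchBatch processRow = loop
  where
    loop = do
      rows <- fetchBatch
      unless (null rows) $ do
        mapM_ processRow rows   -- the inner loop over one batch
        loop                    -- fetch the next batch

-- | Hypothetical stand-in for "fetch forward n": takes up to n items
-- from a list stored in an IORef and leaves the rest for the next call.
fakeFetch :: Int -> IORef [a] -> IO [a]
fakeFetch n ref =
  atomicModifyIORef' ref $ \xs ->
    let (batch, rest) = splitAt n xs
    in (rest, batch)

-- | Sum ten simulated rows, three per batch.
demo :: IO Int
demo = do
  cursor <- newIORef [1 .. 10 :: Int]
  total  <- newIORef 0
  forEachBatch (fakeFetch 3 cursor) (\x -> modifyIORef' total (+ x))
  readIORef total
```

With a real database, the fetch action would run the "fetch forward..." query inside the transaction, and the declare, close and commit steps would wrap the call to forEachBatch. The conduit version above has the advantage that the library takes care of releasing the statement, even when processing fails partway through.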

Fedor Gogolev