Imagine you consume an API that gives you up to 100 elements and allows you to get more elements not by specifying a page number but instead only by telling it which is the last element you received. You would then want to abstract over this quasi-pagination. How would you do that in Scala?

I came up with this probably buggy (= not thoroughly tested) and very stateful piece of code:

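abstract class IteratorThatKeepsOnGiving[T] extends Iterator[T] {
  private var currentIter: Iterator[T] = _
  // None until next() has returned something, so a default value (null or 0)
  // is never passed to nextIterator as if it were a real element.
  private var lastSeen: Option[T] = None

  // Returns the batch that follows lastElement (None requests the first batch).
  def nextIterator(lastElement: Option[T]): Iterator[T]

  fetchNewBatch(None)

  def fetchNewBatch(lastElement: Option[T]): Unit = {
    currentIter = nextIterator(lastElement)
  }

  // When the current batch is used up, try to fetch the one after lastSeen.
  def hasNext: Boolean = currentIter.hasNext || {
    fetchNewBatch(lastSeen)
    currentIter.hasNext
  }

  def next(): T = {
    // Fail like any other exhausted Iterator instead of recursing forever.
    if (!hasNext) throw new NoSuchElementException("no more elements")
    val value = currentIter.next()
    lastSeen = Some(value)
    value
  }
}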

I don't like it.

Usage:

val numberIterator = new IteratorThatKeepsOnGiving[Int] {
  def nextIterator(lastElement: Option[Int]): Iterator[Int] = {
    val i = lastElement.getOrElse(-1)
    ((i + 1) to (i + 4)).iterator
  }
}

There has got to be a better way to do this. Surely I am not the first one to be bothered by this kind of pagination. What's the correct term for this anyway? And how do I abstract over it?

The API in question is Discord's message list API: https://discord.com/developers/docs/resources/channel#get-channel-messages

Alternatives considered

I have found this question: Making a Scala Iterator for a Paginated API

However, in my case I don't have a total page count. This answer also assumes that the page size is constant, and while that holds in my case, I would really prefer a solution that can make do with knowing only "The last element I got was X. How can I get some more elements?" and not "I last got X, how do I get the next Y elements?". Besides, that is irrelevant since I can't specify a page number directly.

There's also flatMapConcat (and mapConcat) in Akka Streams, which suffer from the same problems.

phdoerfler
  • fwiw I have since made a slightly improved and slightly more tested version of above Iterator: https://gist.github.com/phdoerfler/557a4cc4031475def4a9143a9cebccbc – phdoerfler May 04 '21 at 12:10

1 Answer

Well, you could do something like this ... it looks kinda nicer/cleverer, but keeps the entire page of elements in memory as you iterate, rather than just one at a time (this is probably fine though as the paginator is already holding references to them anyway).

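 // Assumes fetchNewBatch returns the batch that follows the given element
 // (like nextIterator in the question).
 Iterator
   .iterate(fetchNewBatch(None).to(LazyList)) { prev =>
     // An explicit parameter is needed: fetchNewBatch(_.lastOption) would pass a function as the argument.
     fetchNewBatch(prev.lastOption).to(LazyList)
   }
   .takeWhile(_.nonEmpty)
   .flatten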
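
For something that can be pasted into a REPL or compiled as-is, here is a minimal sketch of the same idea against a fake data source. The names `IteratePaginationSketch`, `fetchAfter`, and `allItems`, the ten-element data set, and the page size of 4 are all made up for illustration; they stand in for whatever call returns the batch following a given element.

```scala
object IteratePaginationSketch {
  // Hypothetical stand-in for the remote API: ten elements, served at most four at a time.
  private val allElements: Vector[Int] = (0 until 10).toVector
  private val pageSize = 4

  // Returns the batch that follows `last`, or the first batch when `last` is None.
  def fetchAfter(last: Option[Int]): Seq[Int] = {
    val start = last.fold(0)(l => allElements.indexOf(l) + 1)
    allElements.slice(start, start + pageSize)
  }

  // Lazily chains batches; each new batch is requested with the last element of the previous one.
  def allItems: Iterator[Int] =
    Iterator
      .iterate(fetchAfter(None))(prev => fetchAfter(prev.lastOption))
      .takeWhile(_.nonEmpty) // an empty batch means the source is exhausted
      .flatten

  def main(args: Array[String]): Unit =
    println(allItems.toList) // prints List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
}
```

Because batches are only requested as the iterator is consumed, something like `allItems.take(5)` should stop after the second fetch in this sketch.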
Dima
  • Very interesting! `.iterate`! So that's its name. I remembered there was something like that in the standard library but I stopped looking too soon. Nice to see there is a version which supplies the previous value. That is remarkably helpful! – phdoerfler May 04 '21 at 12:07
  • What's the reason for the `LazyList`? Can't you just do: `val iter: Iterator[Seq[Msg]] = Iterator.iterate(nextBatchOfMessages(None))(lastList => nextBatchOfMessages(lastList.lastOption)); val iter2: Iterator[Msg] = iter.flatMap(_.iterator)` ? – phdoerfler May 04 '21 at 13:10
  • Well, I don't know what your `nextBatch` returns. I didn't want to constrain it beyond being `IterableOnce`. Also `.flatMap(_.iterator)` === `.flatten`. And you _do_ need the `takeWhile`, because without it, you'll just start over again after you are through. – Dima May 04 '21 at 13:30
  • Great answer which utilizes exactly the function made for this. From the docs: `def iterate[T](start: T)(f: (T) => T): Iterator[T] Creates an infinite iterator that repeatedly applies a given function to the previous result. ` – yǝsʞǝla May 05 '21 at 01:55
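
To make the two points from the comments concrete (that `Iterator.iterate` is infinite, and that dropping `takeWhile` makes the pagination start over), here is a small sketch. The object name `IterateSemanticsSketch` and the two-page `fetchAfter` source are hypothetical and exist only for this illustration.

```scala
object IterateSemanticsSketch {
  // Iterator.iterate never ends by itself: it yields start, f(start), f(f(start)), ...
  val powersOfTwo: List[Int] = Iterator.iterate(1)(_ * 2).take(5).toList

  // Hypothetical two-page source: None yields the first page, Some(2) the second, anything else nothing.
  def fetchAfter(last: Option[Int]): List[Int] = last match {
    case None    => List(1, 2)
    case Some(2) => List(3)
    case _       => Nil
  }

  // Without takeWhile: the empty page has no last element, so the next request is fetchAfter(None) again.
  val withoutStop: List[List[Int]] =
    Iterator.iterate(fetchAfter(None))(p => fetchAfter(p.lastOption)).take(5).toList

  // With takeWhile: iteration ends at the first empty page.
  val withStop: List[Int] =
    Iterator.iterate(fetchAfter(None))(p => fetchAfter(p.lastOption)).takeWhile(_.nonEmpty).flatten.toList

  def main(args: Array[String]): Unit = {
    println(powersOfTwo) // prints List(1, 2, 4, 8, 16)
    println(withoutStop) // prints List(List(1, 2), List(3), List(), List(1, 2), List(3))
    println(withStop)    // prints List(1, 2, 3)
  }
}
```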