1

I've been trying to wrap my head around how paging in Apache Cassandra with the driver functions in GOlang.

I have the following code for fetching rows

/// Assume all other prerequisites.

session, _ := cluster.CreateSession()

session.SetPageSize(100)
var pagestate []byte

query := session.Query(`select * from keyspace.my_table`)

query = query.PageState(pagestate)
if err := query.Exec(); != nil {
   panic(err)
}

iter := query.Iter()

for {
   row := map[string]interface{}{}
   if !iter.MapScan(row) {
      pagestate = iter.PageState()
      break
   }

   /// Do whatever I need with row.

}

What I'm trying to achieve: The table I'm referencing is huge, over 18k rows huge, and I want to fetch all of them for a special operation in the most efficient manner using the driver's built in paging so the query won't time out.

The problem: I'm not sure how to get the query to resume at the previous page state. I'm not sure if this involves running the query in a loop and managing page state outside of it or not. I understand how to get and set page state, I can't figure out how to iterate the query with a new page sate each time without a proper halt condition when all the paging is done.

My best attempt:

var pagestate []byte

query := session.Query(`select * from keyspace.my_table`)

for {
   query = query.PageState(pagestate)
   if err := query.Exec(); != nil {
      panic(err)
   }

   iter := query.Iter()

   /// I don't know if I'm using this bool correct or not.
   /// My assumption is that this would return false when a new page isn't
   /// avaliable, thus meaning that all the pages have been filled and
   /// the loop can exit.
   if !iter.WillSwitchPage() {
      break
   }

   for {
      row := map[string]interface{}{}
      if !iter.MapScan(row) {
         pagestate = iter.PageState()
         break
      }

      /// Do whatever I need with row.
   }
}

Am I doing this right, or is there a better way to achieve this?

GhostRavenstorm
  • 634
  • 3
  • 9
  • 29

1 Answers1

2

So, as it turns out, WillSwitchPage() will never return true at any point in the loop. Not sure why. I guess it's because of how I'm using MapScan() to control the inner loop.

Anyway, I figured out a solution by checking the []byte for the page state itself at the en dof the query loop. If the driver reaches the end and doesn't fill the page, page state will have 0 elements, so I'm using that as my halt condition. This may or may not be the most elegant or intended way to handle the driver's pagination, but it's functioning as wanted.

var pagestate []byte

query := session.Query(`select * from keyspace.my_table`)

for {
   query = query.PageState(pagestate)
   if err := query.Exec(); != nil {
      panic(err)
   }

   iter := query.Iter()

   for {
      row := map[string]interface{}{}
      if !iter.MapScan(row) {
         pagestate = iter.PageState()
         break
      }

      /// Do whatever I need with row.
   }

   if len(pagestate) == 0 {
      break
   }
}
GhostRavenstorm
  • 634
  • 3
  • 9
  • 29
  • Seems relevant to this question/answer. [Cassandra pagination example with Golang](http://www.inanzzz.com/index.php/post/t7fd/cassandra-pagination-example-with-golang) – BentCoder Feb 14 '21 at 14:03