1

I am trying to efficiently retrieve data that is in the following format:

{
  "datetime": "1453845345493",
  "someIds": ["id2000-4", "id1000-34", "id2000-43", "id250-34"]
}

Specifically, what I want to do is "find all records that have happened since a given time, and of those, return any that have one or more of a list of Ids."

So far I followed the method shown here for creating a compound, multi-index using the following:

r.db("dbName").table("tableName")
  .indexCreate(
    "idDatetime", 
    function(each) {
      return each("someIds").map(function(id){
        return [each("datetime"), id]
      })
    }
    ,{multi: true})

This successfully builds an index based on values that look like ["1453845345493", "id2000-4"]

But now it feel like I'm in a bit too deep, and don't actually know how to make a query that uses this index to accomplish the objective above. How do you craft that query?

Community
  • 1
  • 1
Brideau
  • 4,564
  • 4
  • 24
  • 33

2 Answers2

1

I think(I maybe wrong), basically, we have two types of index lookup:

  • exactly match: get, getAll
  • range match: between

So in your case, it's clearly we cannot use getAll because you want to find all records that have happened since a given time, and of those, return any that have one or more of a list of Id.

That leave us only between. So let's find a way to model it.

I suggest to change datetime field to numeric instead of string. I guess you are storing epoch.

We will create index as you did:

r.table('t11').indexCreate('idDatetime', function(doc) {
  return doc('someIds').map(function(id){
        return [doc("datetime"), id]
  })
}, {multi: true})

Then we query it similar to this:

r.table('t11')
 .between([1453845345493, "id1000-34"], [r.maxval, "id1000-34"], {index: 'idDatetime'})

To find all the document since that epoch time and contains id1000-34. You can find the epoch of yesterday using either JavaScript or RethinkDB datetime function.


UPDATE:

While it isn't perfect, We can simulate either id with sth like this:

r.expr(["id1000-34", "id1000-4"])
  .concatMap(function(needle) {
    return r.table('t11')
     .between([1453845345499, needle], [r.maxval, needle], {index: 'idDatetime'})
  })
  .distinct()
kureikain
  • 2,304
  • 2
  • 14
  • 9
  • This approach works well if I want to filter by time and then look up a single tag, but not if I want to filter by time and then look up multiple tags. An example I mentioned in another comment was "find all the messages since yesterday that contain either 'id2000-1', 'id2000-3' or 'id2000-5'." I've read deeper into the structure of indexes in RethinkDB and I'm not sure it's possible at this time... – Brideau Jan 28 '16 at 15:01
  • Yes, you're right. That isn't possible at this time. However, we can do some work-around. I updated the answer with some ifno – kureikain Jan 28 '16 at 18:02
0

If you haven't you should definitely check out this page on multi-indexes in our docs.

An example using the data you have shown would go like this:

r.table("tableName").getAll("id2000-4", {index: "idDatetime"}).run(conn, callback)

This should get you all documents in that table which have id2000-4 in that array.

Let me know if this works out for you!

dalanmiller
  • 3,467
  • 5
  • 31
  • 38
  • The difficulty that I'm having is that I'm looking to filter on two things: a date range, and then to see if the object has one of many ids. For example, find all the messages since yesterday that contain either 'id2000-1', 'id2000-3' or 'id2000-5'. – Brideau Jan 27 '16 at 18:10