12

The underlying problem: say my documents have "categories" and timestamps. If I want all documents in the "foo" category with a timestamp within the last two hours, it's simple:

function (doc) {
  emit([doc.category, doc.timestamp], null);
}

and then query as

GET server:5894/.../myview?startkey=["foo", |now - 2 hours|]&endkey=["foo", |now|]
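A minimal sketch of building that URL, since the keys have to be JSON-encoded and URL-escaped in a real request. The helper name `rangeQueryUrl` is hypothetical, and the sketch assumes `doc.timestamp` is stored as epoch milliseconds, which the question does not state.

```javascript
// Placeholder copied from the question; the path is elided there too.
const VIEW_URL = "http://server:5894/.../myview";
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// Hypothetical helper: builds the ranged query for one category and one time window.
// Assumes timestamps are numbers (epoch milliseconds).
function rangeQueryUrl(viewUrl, category, fromMs, toMs) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([category, fromMs]),
    endkey: JSON.stringify([category, toMs]),
  });
  return `${viewUrl}?${params.toString()}`;
}

// Everything in "foo" from the last two hours.
const windowEnd = Date.now();
const fooUrl = rangeQueryUrl(VIEW_URL, "foo", windowEnd - TWO_HOURS_MS, windowEnd);
```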

The problem comes when I want something in categories foo or bar, within the last two hours. If I didn't care about time, I could just pull directly by key through the keys collection. Unfortunately, I have no such option with ranges.

What I ended up doing in the meantime is rounding the timestamp to two-hour blocks, and then multiplexing the query out:

POST server:5894/.../myview
keys=[[foo, 0 hours], [foo, 2 hours], [bar, 0 hours], [bar, 2 hours]]

It works, but it will get messy if I want to go back a long way in time (relative to the block size).
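The workaround above might be sketched as follows. It assumes the view emits `[doc.category, <timestamp rounded down to its block>]` and that timestamps are epoch milliseconds; `blockStart` and `blockKeys` are hypothetical names.

```javascript
// Size of one time block; the example above uses two-hour blocks.
const BLOCK_MS = 2 * 60 * 60 * 1000;

// Round a timestamp (epoch ms, an assumption) down to the start of its block.
function blockStart(timestampMs, blockMs = BLOCK_MS) {
  return Math.floor(timestampMs / blockMs) * blockMs;
}

// Hypothetical helper: one [category, blockStart] key for every category and
// every block that overlaps the window [fromMs, toMs].
function blockKeys(categories, fromMs, toMs, blockMs = BLOCK_MS) {
  const keys = [];
  for (const category of categories) {
    for (let t = blockStart(fromMs, blockMs); t <= toMs; t += blockMs) {
      keys.push([category, t]);
    }
  }
  return keys;
}

// Body for the POST: { "keys": [...] }
const blockBody = JSON.stringify({ keys: blockKeys(["foo", "bar"], 0, BLOCK_MS) });
```

The key list grows as categories times blocks, which is the messiness described above. The first and last blocks can also contain documents outside the exact window, so the rows would still need filtering on the client.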

Is there a way to send multiple startkey/endkey pairs to a view, akin to the keys: [] array that can be posted for keys?

Octavian Helm
kolosy

3 Answers

9

There is a CouchDB issue requesting exactly that. I've attached a simple, no-guarantees patch against 0.10.1 to that ticket, which may work for you. It works for me and lets me do things like:

{
    "keys": [
        {
            "startkey": ["0240286524","2010","03","01"],
            "endkey": ["0240286524","2010","03","07",{}]
        },
        {
            "startkey": ["0442257276","2010","03","01"],
            "endkey": ["0442257276","2010","03","07",{}]
        }
    ]
}

in the POST body, which lets me get all the data across multiple tracking ids, for a range of dates. I call with group=true&group_level=1 to have the results grouped by tracking id. Deeper group levels would allow me to group by tracking id|year, tracking id|year|month etc.

Multiple connections were an unscalable overhead for me as I'd be looking to make 2000 concurrently :) (No, a new view is not an option - we're already at 400GB for data plus one view!)

The issue and patch are at https://issues.apache.org/jira/browse/COUCHDB-523 .

majelbstoat
4

You're probably better off just doing two queries. CouchDB can handle multiple simultaneous queries pretty well, so spin off several processes/threads and query for foo and bar docs separately.

CouchDB does not currently support multiple range queries. ORing and ANDing keys is pretty much not doable in one query.
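A rough sketch of the "one query per category" approach, run concurrently and merged on the client. `queryCategories` is a hypothetical name, and `fetchRows` is an injected function (for example, a wrapper around one ranged GET per category) rather than any CouchDB API.

```javascript
// fetchRows(category) is assumed to return a Promise of that category's view rows.
async function queryCategories(categories, fetchRows) {
  // Start every request at once, then wait for all of them to finish.
  const perCategory = await Promise.all(categories.map((c) => fetchRows(c)));
  // Concatenate the per-category row arrays into a single list.
  return perCategory.flat();
}
```

Rows come back grouped by category in the order the categories were passed, so any cross-category sort (by timestamp, say) has to happen after the merge.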

Jeremy Wall
4

This has been added in newer versions of CouchDB. To add multiple ranges of start/end keys, you can use a POST request to your view, with a body that looks something like this:

{
  "queries": [
    { "startkey": 10, "endkey": 11 },
    { "startkey": 16, "endkey": 18 }
  ]
}
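Applied to the question's `[category, timestamp]` keys, the request body and the handling of the response could look roughly like this. `multiRangeBody` and `flattenRows` are hypothetical helpers, timestamps are assumed to be epoch milliseconds, and the response is assumed to carry one entry per query under `results`; check the view API docs for your CouchDB version.

```javascript
// Hypothetical helper: one {startkey, endkey} query per category, same time window.
function multiRangeBody(categories, fromMs, toMs) {
  return {
    queries: categories.map((category) => ({
      startkey: [category, fromMs],
      endkey: [category, toMs],
    })),
  };
}

// Assumed response shape: { results: [ { rows: [...] }, ... ] },
// one entry per query, in request order.
function flattenRows(response) {
  return response.results.flatMap((result) => result.rows);
}
```

The body would be sent as `JSON.stringify(multiRangeBody(["foo", "bar"], from, to))` with a `Content-Type: application/json` header.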

I know it's an old question but I initially found it when I was looking for exactly this!

Lorna Mitchell
  • Any doc references for this? – Isaac May 04 '17 at 02:35
  • Not yet, I have promised to patch the docs as we don't have them yet! I did blog it though if another example would be useful https://lornajane.net/posts/2017/multiple-search-keys-in-couchdb – Lorna Mitchell May 04 '17 at 08:08
  • Ah, I found that the API docs do already exist, they're here: http://docs.couchdb.org/en/2.0.0/api/ddoc/views.html#sending-multiple-queries-to-a-view – Lorna Mitchell Jun 23 '17 at 07:25
  • Ehh, isn't this not really what the OP asked? I understood the goal is not to launch a query which selects records with a (one-dimensional) key from the union of two ranges, but rather to select those records which have a multi-dimensional key with both dimensions within a certain range. Think of it like an index on 2D spatial data on which you want to launch a query to find points with x in range [x0, x1] and y in [y0, y1], like you would query a quadtree. In OP's post, x0=x1="foo" and y0 = now-2h, y1 = now. – RDM Jul 05 '17 at 08:55
  • This is how to achieve what the OP asked. CouchDB can only filter on a continuous range of values as they come in the view (i.e. sorted by the view's keys). If the results you want aren't next to one another in the view output, you need to fetch multiple ranges using the approach I outlined. Hope that makes it clearer! – Lorna Mitchell Jul 06 '17 at 11:19