Background

  • I have a CouchDB cluster running a few databases.
  • One of these has docs with a few hundred pieces of data in a somewhat complex structure (for example, a 'human' with height, weight, eye color, hair color, clothing, GPS position, and a few hundred other things).
  • I want to look for intersections between a couple of data points, e.g. BLUE eyes and BLACK hair.
  • I have hundreds factorial possible combinations I could search for. I do these searches fairly rarely.
  • I write to this database quite a lot.

What I want to do

Use a temporary view to pass in a map/reduce for these intersection lookups (queries) when they occur.

Why not?

The docs tell me it's a terrible idea.

The Question

Why is it a terrible idea? Is it really a terrible idea?

Bonus Points

If it is a terrible idea, what's a good idea? A view for every combination would mean a silly number of views, and loading the whole pile of data into another program for this feels like overkill (I'm noticing Lucene has some tools for this, and I could cook up a Node one if I had to). I could move to a tool like that if I had to; I just don't yet understand why.

Suni

1 Answer

Temporary views are only intended for development use: they are forced to rebuild the entire view index each time they are invoked, and their results are not saved like those of a typical view. The penalty grows with the number of documents in your database, and it will bite you quickly if you try to use temporary views as a dynamic query system. (source: v1.6.1 documentation) As a matter of fact, temporary views were dropped entirely in v2. (source: v2.0.0 upgrade notes)
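To make the cost concrete, here is a toy model in Python of what "rebuild on every call" means. It is only a sketch and not how CouchDB is implemented; the field names (`eye_color`, `hair_color`) and the helper `temp_view_query` are made up for illustration.

```python
def temp_view_query(docs, map_predicate):
    """Toy stand-in for a temporary view: nothing is cached between calls,
    so every call walks every document."""
    scanned = 0
    hits = []
    for doc in docs:
        scanned += 1
        if map_predicate(doc):
            hits.append(doc["_id"])
    return hits, scanned


# Hypothetical data set: 1000 documents with two of the "few hundred" fields.
docs = [
    {
        "_id": str(i),
        "eye_color": "blue" if i % 2 == 0 else "brown",
        "hair_color": "black" if i % 3 == 0 else "red",
    }
    for i in range(1000)
]


def blue_and_black(doc):
    return doc["eye_color"] == "blue" and doc["hair_color"] == "black"


# Running the same lookup five times costs five full passes over the data.
total_scanned = 0
for _ in range(5):
    hits, scanned = temp_view_query(docs, blue_and_black)
    total_scanned += scanned

print(len(hits), "matches;", total_scanned, "documents scanned in total")
```

The work per query is proportional to the size of the database, no matter how few documents match or how often the same query was asked before.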

I'm not sure which version of CouchDB you are running, but if you are using v1 and you want to do highly dynamic queries, you may be much better served by incorporating some sort of full-text indexer, such as Apache Lucene or Elasticsearch. They add a lot of flexibility to your searching, in addition to supporting multiple parameters simultaneously.

If you are using CouchDB v2, you can also consider the new Mango Query Server, which adds MongoDB-style syntax for querying documents. With this feature, you can definitely include multiple parameters and do more dynamic searching.
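As a rough sketch of what that looks like for the "BLUE eyes and BLACK hair" case, the Python below builds the two JSON bodies involved: one for `POST /{db}/_index` to create an index, and one for `POST /{db}/_find` to run the query. The field names `eye_color` and `hair_color` and the helper functions are assumptions for illustration; adjust them to your document structure.

```python
import json


def build_index_request(fields):
    """Body for POST /{db}/_index: a JSON index over the given fields."""
    return {"index": {"fields": list(fields)}, "type": "json"}


def build_find_request(**criteria):
    """Body for POST /{db}/_find: fields listed together in a selector
    are combined with an implicit AND."""
    return {"selector": dict(criteria)}


# Hypothetical field names, not taken from the original question's schema.
index_body = build_index_request(["eye_color", "hair_color"])
find_body = build_find_request(eye_color="blue", hair_color="black")

print(json.dumps(index_body))
print(json.dumps(find_body))
```

The index is created once and then maintained by CouchDB, so repeated queries on those fields do not need to rescan the whole database the way a temporary view does.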

Dominic Barnes
  • A: thanks for pointing out Mango Query (I'll move to 2.0 for that!) B: why would a Mango query perform much better than a temp view? C: all I want is direct map/reduce T_T but I know I'm greedy. – Suni Feb 13 '17 at 21:06
  • The Mango query engine uses indexes on the properties you query (you create them through the _index endpoint), so it isn't rebuilding on the fly like temp views. – Dominic Barnes Feb 13 '17 at 22:38
  • Are those indexes recalculated for each write? And are they ever rolled off? AKA if I have a couple of queries a day for a year, will I end up with slow writes because I have so many indexes? – Suni Feb 14 '17 at 14:19
  • That's the thing: a temp view _doesn't even make an index_; it iterates **every** single document **every** single time you send the temp view request. – Dominic Barnes Feb 14 '17 at 17:24