I am looking for good documentation on GeoCouch, and trying to find out whether I can use it to implement n-dimensional indexing. I need geospatial functionality, and this was the naive solution I found to my problem. I have a 12-dimensional feature space, which can be treated as a metric space with, say, the Euclidean metric. I need to cluster points in this 12-dimensional space and query k-nearest neighbors. If anyone has a better solution, or can point me in the right direction for using GeoCouch with CouchDB, please respond to this post.

1 Answer

The Couchbase documentation for geospatial views currently reflects only the old API, so it is not much help for the newer multidimensional features.

The best documentation I can point you to for that is at https://github.com/couchbase/geocouch/wiki/Spatial-Views-API. Under the Array heading, you'll find:

As the spatial views are now multi-dimensional, you specify the key as array where every element is one dimension. Every dimension can either be a single value or a range. Only numbers supported (there's one special case for GeoJSON geometries, see below).

And in the Queries section you'll see that:

The queries for the spatial view have two new query parameters (start_range and end_range) which are preferred over the bbox parameter.

Basically you can emit a key like [0.0001, -0.0001, [2012,2014]] to perhaps indicate the presence of an object near Null Island over a range of two years. Then you could query start_range=[-0.5, -0.5, 2013]&end_range=[0.5, 0.5, null] to find everything in that vicinity since 2013 and any time after, thus overlapping that sample item.
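To make the overlap rule concrete, here is a minimal Python sketch. It only encodes the two parameters as a query string and re-implements the overlap test locally for illustration; it does not talk to a server. The function names `spatial_query_string` and `overlaps` are hypothetical, and treating `null` as an open-ended bound is my reading of the example above, not a statement of how GeoCouch implements it.

```python
import json
from urllib.parse import urlencode


def spatial_query_string(start_range, end_range):
    """Encode start_range/end_range as URL query parameters holding JSON arrays."""
    return urlencode({
        "start_range": json.dumps(start_range, separators=(",", ":")),
        "end_range": json.dumps(end_range, separators=(",", ":")),
    })


def overlaps(key, start_range, end_range):
    """Local illustration: does an emitted key overlap the query box?

    Each dimension of `key` is either a number or a [low, high] range.
    A bound of None (JSON null) is treated as open-ended (an assumption).
    """
    for dim, lo, hi in zip(key, start_range, end_range):
        k_lo, k_hi = dim if isinstance(dim, list) else (dim, dim)
        if lo is not None and k_hi < lo:
            return False  # key lies entirely below the query range
        if hi is not None and k_lo > hi:
            return False  # key lies entirely above the query range
    return True


qs = spatial_query_string([-0.5, -0.5, 2013], [0.5, 0.5, None])
hit = overlaps([0.0001, -0.0001, [2012, 2014]], [-0.5, -0.5, 2013], [0.5, 0.5, None])
```

With the sample key from above, `hit` is true because the emitted range 2012 to 2014 overlaps the open-ended range starting at 2013. The same pattern extends to a 12-element key.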

I do not think k-nearest search has been released, although I think there was a prototype patch at one point. You might inquire through the Couchbase forums or the GeoCouch issue tracker, or perhaps ask @vmx directly. You could implement a "poor man's version" by limiting results and searching larger or smaller bounding boxes until the right result set is found. How far that falls short of optimal depends on how your data is distributed.
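A rough Python sketch of that "poor man's" approach, under stated assumptions: `query_box` is a hypothetical callable standing in for a spatial view request with `start_range`/`end_range`, and it returns the points inside the box. The box is grown until the k-th closest candidate lies within the box's half-width, because only then can no point outside the box be closer under the Euclidean metric. It grows the box rather than relying on a result limit, and the in-memory `make_in_memory_box_query` helper exists only so the sketch can run without a database.

```python
import math


def knn_by_expanding_box(center, k, query_box,
                         initial_half_width=1.0, growth=2.0, max_rounds=32):
    """Approximate-then-verify k-nearest search using bounding-box queries only."""
    half = initial_half_width
    candidates = []
    for _ in range(max_rounds):
        start = [c - half for c in center]
        end = [c + half for c in center]
        # Sort whatever the box returned by Euclidean distance to the center.
        candidates = sorted(query_box(start, end),
                            key=lambda p: math.dist(p, center))
        # A ball of radius `half` fits inside the box, so if the k-th candidate
        # is within `half`, nothing outside the box can be closer.
        if len(candidates) >= k and math.dist(candidates[k - 1], center) <= half:
            return candidates[:k]
        half *= growth
    return candidates[:k]  # best effort if the round limit is reached


def make_in_memory_box_query(points):
    """Hypothetical stand-in for the database: filter a list by bounding box."""
    def query_box(start, end):
        return [p for p in points
                if all(lo <= x <= hi for x, lo, hi in zip(p, start, end))]
    return query_box


grid = [(float(i), float(j)) for i in range(-5, 6) for j in range(-5, 6)]
nearest = knn_by_expanding_box((0.2, 0.1), 3, make_in_memory_box_query(grid))
```

In 12 dimensions a box holds far more volume than its inscribed ball, so each round may fetch many candidates that are then discarded; that is the cost referred to above.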

natevw