Suppose you have a log-like collection of documents in CouchDB, as in this tabulated representation of JSON documents and attributes (each row is a JSON document, each column is an attribute):

PRODUCT_ID START_DATE PRICE
0000000001 2016-01-01 100.00
0000000002 2016-01-01 100.00
0000000003 2016-01-01 100.00
0000000001 2016-01-02 100.00
0000000002 2016-01-02 200.00
0000000003 2016-01-02 100.00
0000000001 2016-01-03 100.00
0000000002 2016-01-03 200.00
0000000003 2016-01-03 100.00

Is it possible, via a MapReduce view, to produce a schema that implements Ralph Kimball's Slowly Changing Dimension concept?

e.g.:

PRODUCT_ID START_DATE PRICE  END_DATE
0000000001 2016-01-01 100.00 2999-12-31
0000000002 2016-01-01 100.00 2016-01-02
0000000003 2016-01-01 100.00 2999-12-31
0000000002 2016-01-02 200.00 2999-12-31

I'm using Cloudant, which has a few enhancements over base CouchDB.

Related (much broader) question: Data warehousing principles and NoSQL

Alex R

1 Answer

In CouchDB, the /database/_all_docs view is sorted by document ID. If you can tolerate getting two consecutive rows per desired result, that ordering is CouchDB's default and needs no custom view.

It should be easy enough to coalesce the two rows into one for your application.
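As a minimal sketch of that coalescing step, the Python below works on plain dictionaries rather than on a live CouchDB or Cloudant response. It assumes the rows arrive sorted by product and then by start date (for example because the document IDs are built that way, which the question does not state), and it reuses the 2999-12-31 sentinel from the question's example. The function name `coalesce` and the constant `OPEN_END` are hypothetical names chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Sentinel "still current" end date, taken from the question's example.
OPEN_END = "2999-12-31"


def coalesce(rows):
    """Collapse consecutive same-price rows per product into date ranges.

    Assumption: `rows` is already sorted by (PRODUCT_ID, START_DATE).
    """
    result = []
    for _product_id, group in groupby(rows, key=itemgetter("PRODUCT_ID")):
        current = None
        for row in group:
            if current is not None and row["PRICE"] == current["PRICE"]:
                continue  # price unchanged, so the open range keeps running
            if current is not None:
                # Price changed: close the previous range at the new start date.
                current["END_DATE"] = row["START_DATE"]
            current = {**row, "END_DATE": OPEN_END}
            result.append(current)
    return result


# The question's sample data, sorted by product and then start date.
rows = [
    {"PRODUCT_ID": "0000000001", "START_DATE": "2016-01-01", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000001", "START_DATE": "2016-01-02", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000001", "START_DATE": "2016-01-03", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000002", "START_DATE": "2016-01-01", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000002", "START_DATE": "2016-01-02", "PRICE": "200.00"},
    {"PRODUCT_ID": "0000000002", "START_DATE": "2016-01-03", "PRICE": "200.00"},
    {"PRODUCT_ID": "0000000003", "START_DATE": "2016-01-01", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000003", "START_DATE": "2016-01-02", "PRICE": "100.00"},
    {"PRODUCT_ID": "0000000003", "START_DATE": "2016-01-03", "PRICE": "100.00"},
]

for r in coalesce(rows):
    print(r["PRODUCT_ID"], r["START_DATE"], r["PRICE"], r["END_DATE"])
```

With the sample data this yields the same four ranges as the desired table in the question, ordered by product rather than by start date.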

If you need more calculation on the doc values, you can create a custom view that works the same way, calling emit(doc._id) so that the document ID is the key: http://docs.couchdb.org/en/2.0.0/couchapp/views/intro.html

Jan Lehnardt