12

I am looking into porting a website to CouchDB, and it looks very interesting.

However, a big problem is that CouchDB does not seem to support read authentication; all documents within a database are accessible to all readers.

It is suggested elsewhere to use different databases for different reader groups or to implement reader authentication in another (middle) tier. Neither is an option for this project, where access is determined by complex, per-document ACLs.

I was thinking of implementing the authentication in lists and restricting all access to CouchDB to these lists. This restriction could be enforced by simple mod_rewrite rules in the Apache server used as a reverse proxy. The lists would simply fetch each row and check the userCtx against the document's ACL. Something like:

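function(head, req) {
  var row;
  // Walk the view rows; abort on the first row the user may not read.
  while ((row = getRow())) {
    // Assumes each row value carries an ACL object keyed by user name.
    if (row.value.ACL && row.value.ACL[req.userCtx.name]) {
      // send() expects a string, so serialize the value first.
      send(toJSON(row.value));
    } else {
      throw({unauthorized: "You are not allowed to access this resource"});
    }
  }
}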

Since I have no experience with CouchDB, and I haven't read about this approach anywhere, I'd like to know whether this approach could work.

Is this a way to implement read access control, or am I abusing lists for the wrong purpose? Should I not expect such a simple solution to be possible with CouchDB?

Community
Tomas
  • Take a look at https://github.com/ermouth/covercouch – it implements read ACL keeping original CouchDB REST API untouched. – ermouth Feb 14 '15 at 22:57

4 Answers

5

Apache mod_rewrite is a middle tier, so it is not clear what you mean when you say a middle tier is not an option.

Implementing your security policy based on data in CouchDB is perfectly fine. However, the cost is that you are responsible for the correctness of the implementation. It's not as bad as it sounds. Remember, people have been doing this with MySQL web apps for a long time.

The thing to keep in mind is that CouchDB does not support document-level read permissions because it is impractical to track those permissions as the data weaves through all the maps and reduces of the views. For example, say we have a bidding system.

  • There are two bids, mine and yours
  • I have read access to my bid which is $10, but I cannot read your bid document due to middleware policy
  • However I discover a view which computes the average of all bids. The average is $7.50. Therefore I know you bid $5 and I will lower my bid to $6
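To make the arithmetic in the bidding example concrete, here is a minimal sketch in plain JavaScript. The function names are made up for illustration; in CouchDB the averaging would happen in a view's reduce step rather than in a helper like this.

```javascript
// Stand-in for a hypothetical "average bid" view: it sees every bid,
// including ones the caller is not allowed to read individually.
function averageBid(bids) {
  var total = bids.reduce(function (sum, amount) { return sum + amount; }, 0);
  return total / bids.length;
}

// With exactly two bids, the average plus my own bid reveals the other one.
function inferOtherBid(average, myBid) {
  return 2 * average - myBid;
}

var average = averageBid([10, 5]);        // 7.5
var leaked = inferOtherBid(average, 10);  // 5
```

The point is that the aggregate alone is enough to recover a value the document-level policy was supposed to hide.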

In other words, if you are wrapping the CouchDB API, you will at least need to whitelist those queries which are allowed. And remember, the vhost and rewrite rules run within CouchDB so simply looking at the incoming query may not be enough.

Hopefully that sheds some light on why read control is at the database level.

JasonSmith
  • Thanks. What do you mean by "the vhost and rewrite rules run within CouchDB" ? – Tomas Aug 23 '10 at 10:03
  • Could you (or someone) more specifically address whether LISTS are a good way to implement this? – Tomas Aug 24 '10 at 08:32
  • Hi, Tomas. Short answer: **no**. In general, lists are not a good way to enforce security policy because the user could simply query /_all_docs and see the entire database! – JasonSmith Jan 19 '11 at 03:54
  • Just use a CouchDB hosting provider that does the proxy part for you. On Smileupps you can define, redefine, or remove any domains you want, whenever you want. You can then change or remove your CouchDB root access domain completely, and link other "public" domains to a ddoc with filtering rewrite rules, thereby preventing access to any restricted handlers such as _all_docs. – giowild Feb 18 '15 at 09:00
4

Usually it is sufficient to restrict access to certain views - this can be done via lists as you proposed (thanks for the idea). Using unguessable IDs for documents, you already have some kind of access control for documents. I would avoid iterating through the rows and checking for permissions there, but I don't think that's much of a problem either.

Some have mentioned here that the purpose of lists is to change the format - I don't agree, as even the official CouchDB guide states that lists could even produce json documents.

Another way is to restrict users per database and use selective replication, so that one database only contains the data a certain group of users is allowed to access. See "couchdb read authentication". This is not actually per-user, but it may still be an option for you. For details on filtered replication see http://wiki.apache.org/couchdb/Replication
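As a rough sketch of the filtered-replication approach: a filter function lives in a design document and receives each document plus the request, and the replication request names the filter and passes parameters to it. The `group` field, the database names, the `acl` design document name and the `byGroup` filter name are all assumptions made up for this example. In a real design document the function would be stored anonymously under `filters`.

```javascript
// Filter function for a design document's "filters" section.
// Assumption: every document carries a `group` field naming its reader group.
function byGroup(doc, req) {
  // Keep design documents out of the per-group copy.
  if (doc._id.indexOf("_design/") === 0) {
    return false;
  }
  return doc.group === req.query.group;
}

// Request body for POST /_replicate (all names here are hypothetical).
var replication = {
  source: "master",
  target: "readers_sales",
  filter: "acl/byGroup",
  query_params: { group: "sales" }
};
```

Each reader group then gets read access only to its own target database, so the restriction is enforced by CouchDB's database-level permissions rather than by application code.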

Edit: I just came up with a great idea to enforce per document user permissions via lists with better performance:

  1. You pass the user name as an argument to the view and filter accordingly.
  2. In the list using the view, you check whether the given argument user name is identical to the actual user.

The advantage is that CouchDB, as far as I know, internally uses caching for views. I'm not sure about how the caching works with lists. Also I think iterating and filtering in views is generally faster than in lists.
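The two steps from the edit could be sketched roughly as follows. This is one possible reading, not a tested setup: the view keys every row by owner, and the list checks the owner in each returned row's key against the authenticated user instead of parsing the query parameters. The field names `user_id` and `sort_value` and the function names are assumptions; in a design document both functions would be anonymous, and `emit`, `getRow`, `send` and `toJSON` are provided by CouchDB's query server.

```javascript
// Map function: key every row by owner first, then by the sort value,
// so a key range such as ["alice"] .. ["alice", {}] selects one user's rows.
function mapByUser(doc) {
  emit([doc.user_id, doc.sort_value], doc);
}

// List function: refuse anonymous requests and any row whose key does not
// start with the authenticated user's name.
function listOwnRows(head, req) {
  var user = req.userCtx.name;
  if (!user) {
    throw({ unauthorized: "Please log in" });
  }
  var row;
  while ((row = getRow())) {
    if (row.key[0] !== user) {
      throw({ unauthorized: "You are not allowed to access this resource" });
    }
    send(toJSON(row.value) + "\n");
  }
}
```

The per-row comparison is cheap, and the heavy filtering is done by the view's key range. This still only helps if every other route to the data is blocked.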

Community
  • Thanks for your answer. About your edited idea: this seems useful only in the limited case where no ordering or filtering other than on username is required, correct? – Tomas Oct 26 '10 at 10:28
  • No, you can use it any way you want, because CouchDB allows sorting with multiple values. Just do something like "emit([doc.user_id, doc.sort_value], doc)". – Johann Philipp Strathausen Oct 31 '10 at 17:29
  • Nice answer. But remember, the user could fetch /_all_docs and see the whole DB! If you have a proxy layer and you are *sure* that it blocks all unneeded URLs, then a list could work. But that is subtle. Rewrites, show/list functions, and views could all potentially open a back door to the data. – JasonSmith Jan 19 '11 at 03:57
  • Just use a CouchDB Hosting provider which does the proxy part for you. On Smileupps you can define/redefine/remove all domains you want when you want. You can then change or remove your CouchDB root access domain completely, and link other "public" domains to ddoc with filtering rewriting doc rules – giowild Feb 18 '15 at 08:58
2

List functions are a reasonable way to enforce read ACLs in simple cases, but this approach has several drawbacks.

First, you need something in front of CouchDB to block every read request that does not go through the list function implementing the ACL. _all_docs, requests with reduce=true, direct GETs of documents, and many others must all be blocked. The simplest way is to use Apache with regular-expression masks.
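Such a mask might look roughly like this, written as a JavaScript predicate for illustration rather than as actual Apache configuration. The database name `db`, design document `ddoc` and list name `acl` are placeholders, and the check assumes the path has already been separated from the query string.

```javascript
// Only list requests against one design document pass; everything else
// (_all_docs, direct document GETs, raw view queries, ...) is refused.
// Placeholder names: database "db", design document "ddoc", list "acl".
var ALLOWED_PATH = /^\/db\/_design\/ddoc\/_list\/acl\/[A-Za-z0-9_-]+$/;

function isAllowed(method, path) {
  return method === "GET" && ALLOWED_PATH.test(path);
}
```

An allow-list like this is safer than trying to enumerate every URL that must be blocked, since anything not explicitly matched is rejected.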

Second, you must understand that there is no simple way to control access to attachments. Although you can block any read request that does not match your /db/_design/ddoc/_list/list/view pattern, you cannot build an effective view+list pair that provides access control for attachments.

It’s absolutely impossible in CouchDB 1.5 and earlier, because a view index cannot include attachment data. It’s nearly impossible in CouchDB 1.6, since processing base64-encoded attachments as JSON is a CPU and RAM hog.

Third, this method is slow in any case. The reason is simple: list functions do not stream. First the entire response of the view function is fetched and serialized, then the list processor deserializes it again, then the result is processed by the list function, and finally it is serialized once more.

ermouth
1

I'm not sure using a list is the best option for restricting access to resources, since lists are functions used to render the output of a view in a specific format (RSS, CSV, config files, HTML, ...).

Have you considered using a document containing users and their permissions? I found a post by Kore Nordmann which explains how to convert the classical user/group/permissions from relational databases to the CouchDB model:

[Image from Kore Nordmann's post, not reproduced here]

Depending on its permissions, a user would have access to only a set of defined views.

CouchDB offers validation functions but they only get called when a document is created or updated. The O’Reilly book states that "The authentication system is pluggable, so you can integrate with existing services to authenticate users to CouchDB using an http layer, LDAP integration, or through other means". But since you mentioned a middle tier is not an option, the list could be a temporary solution until more authentication support is added to CouchDB.

jdecuyper
  • Thanks. The post by Kore Nordmann uses user/permission tables as an example of data and how it is transformed to CouchDB. It does not seem to address the problem of actually restricting access to resources. – Tomas Aug 20 '10 at 15:06
  • Also, I understand that the purpose of lists is primarily to change the format, but I understand that they are also used for filtering beyond the B-tree, right? – Tomas Aug 20 '10 at 15:08
  • You're right that the post does not offer a complete solution to the permissions problem, but it is a good starting point and, as shown in the picture, you could use the "permissions" structure to hold the views available to one particular user. I didn't know about the list/B-tree relationship, but I'm looking into it right now. Do you have a link that explains it in more detail? http://books.couchdb.org/relax/appendix/btrees explains how the B-tree is (basically) implemented. – jdecuyper Aug 20 '10 at 15:35
  • What I mean: a view offers sorting and filtering linearly on a key. If you want additional filtering within that view, you need a list, right? I don't completely understand it yet, but this is what I gather from the book you're referencing. – Tomas Aug 20 '10 at 15:41
  • I updated my answer a bit. And yes, I think using the list could be an option to restrict access to some resources. – jdecuyper Aug 20 '10 at 16:19