
I'm trying to provide an interface for users to write custom queries over the database. I need to make sure they can only query the records they are allowed to see. To do that, I decided to apply row-based access control using django-guardian.

Here is what my models look like:

class BaseClass(models.Model):
    somefield = models.TextField()

    class Meta:
        permissions = (
            ('view_record', 'View record'),
        )

class ClassA(BaseClass):
    # some other fields here
    classb = models.ForeignKey('ClassB', on_delete=models.CASCADE)

class ClassB(BaseClass):
    # some fields here
    classc = models.ForeignKey('ClassC', on_delete=models.CASCADE)

class ClassC(BaseClass):
    # some fields here
    pass

I would like to be able to use get_objects_for_group as follows:

>>> group = Group.objects.create(name='some group')
>>> class_c = ClassC.objects.create(somefield='ClassC')
>>> class_b = ClassB.objects.create(somefield='ClassB', classc=class_c)
>>> class_a = ClassA.objects.create(somefield='ClassA', classb=class_b)
>>> assign_perm('view_record', group, class_c)
>>> assign_perm('view_record', group, class_b)
>>> assign_perm('view_record', group, class_a)
>>> get_objects_for_group(group, 'view_record')

This gives me a QuerySet. Can I use the BaseClass defined above to write a raw query over the other related classes?

>>> get_objects_for_group(group, 'view_record').intersection(
...     BaseClass.objects.raw('select * from table_a a '
...                           'join table_b b on a.id=b.table_a_id '
...                           'join table_c c on b.id=c.table_b_id '
...                           'where some conditions here'))

Does this approach make sense? Is there a better way to tackle this problem?

Thanks!

Edit:

Another way to tackle the problem might be creating a separate table for each user. I understand the complexity this might add to my application, but:

  • The number of users will not exceed a few hundred for a long time. This is not a consumer application.
  • Per our use case, it's quite unlikely that I'll need to query across these tables. I won't write a query that needs to aggregate anything from table1, table2, table3 belonging to the same model.
  • Maintaining a separate table per customer could have an advantage.

Do you think this is a viable approach?

ertan

3 Answers


After researching many options, I found that I can solve this problem at the database level using Row Level Security in PostgreSQL. It ends up being the easiest and most elegant solution.

This article helped me a lot to bridge the application level users with PostgreSQL policies.

What I learned by doing my research is:

  • Separate tables could still be an option in the future, when customers could potentially affect each other's query performance, since they are allowed to run arbitrary queries.

  • Trying to solve this at the ORM level is almost impossible if you plan to support raw or ad-hoc queries.
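
As a concrete illustration, here is a minimal sketch of the two pieces involved. All names here (the `app_baseclass` table, the `customer_id` column, the `app.current_customer` session variable) are hypothetical examples following the pattern in the article above, not code from the original post:

```python
# Sketch: generate the PostgreSQL RLS statements for one table, and the
# per-request statement that binds the session to one customer.

def rls_policy_sql(table, tenant_column):
    """Return DDL that enables RLS on `table` and restricts visible rows
    to the customer id stored in the app.current_customer setting."""
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('app.current_customer')::integer);",
    ]

# Run once per request/connection (e.g. via cursor.execute) before the
# user's arbitrary query; the policies then do the filtering:
SET_CUSTOMER_SQL = "SELECT set_config('app.current_customer', %s, false);"
```

The generated DDL would typically be applied from a checked-in script or migration, so each table is covered exactly once.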

ertan

I think you already know what you need to do. The word you are looking for is multitenancy, though it is not one table per customer; one schema per customer will suit you best. Unfortunately, the best article I had on multitenancy is no longer available. See if you can find a cached version: https://msdn.microsoft.com/en-us/library/aa479086.aspx; otherwise, there are numerous articles available on the internet.
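
On PostgreSQL, the schema-per-customer variant can be sketched roughly like this (the `customer_` naming convention and the helper function are hypothetical, not part of any framework):

```python
# Sketch: one PostgreSQL schema per customer. Each customer's session
# sets search_path so unqualified table names resolve to their schema.

def tenant_schema_sql(customer_slug):
    """Build the statements for creating, and routing queries to,
    a single customer's schema."""
    schema = f"customer_{customer_slug}"
    return {
        # Run once when the customer is provisioned:
        "create": f"CREATE SCHEMA IF NOT EXISTS {schema};",
        # Run per connection/session before executing queries:
        "route": f"SET search_path TO {schema}, public;",
    }
```

Each customer's tables then live in their own namespace, so even an arbitrary user-written query can only see that customer's copies of the tables.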

Another viable approach is to take a look at custom managers. You could write one custom manager per Model-Customer pair and query it accordingly. But all this adds application complexity and will soon get out of hand, and any bug in the application's security layer is a nightmare for you.

Weighing both, I'd say the multitenancy solution you describe in your edit is by far the best approach.

SomeTypeFoo
  • I think I'll end up going with Row Level Security on PostgreSQL: https://www.postgresql.org/docs/9.5/static/ddl-rowsecurity.html. This article helped me a lot to bridge the application level users with postgres policies: https://blog.2ndquadrant.com/application-users-vs-row-level-security/ – ertan Nov 28 '17 at 16:16
  • Interesting read. Thanks for sharing. But there is one problem I see: the overhead of creating new policies and keeping the db layer and the application layer in sync is not worth it. – SomeTypeFoo Nov 29 '17 at 06:53
  • Creating a policy for each table is an overhead, yes, but you only have to do it once. If you write a script to do so and check it into the repo, it doesn't feel so bad. Re: syncing the db and application layers: this is mostly handled by setting the session variable appropriately with one line of code, and the rest is handled inside the policies. The only catch is that if this particular field changes via a migration, that might create a security flaw. I'm planning to mitigate this with some integration tests. I feel it's simpler to maintain compared to multitenancy. – ertan Nov 29 '17 at 18:27
  • If there is any way you can skip raw queries and use the Django ORM, you can give this a look and see if it fits: https://github.com/bennylope/django-organizations – SomeTypeFoo Nov 30 '17 at 10:53

First, you should give us more details on how your architecture is set up and built with Django so that we can help you. Have you implemented an API? Using Django templates is not really a good idea if you are building a large-scale application that consumes a lot of data, because this can massively affect the query load. I suggest separating your front end from the backend.

DorkSeal3R