
I have a fairly complex Django application that has been in production for over a year.

The application holds data from different customers. The data is obviously in the same table, separated by customer_id.

Recently the client has started to ask questions about data segregation. Since the app is sold on a per-user basis and holds sensitive information, customers have been asking whether and how we keep each customer's data separate, and what security measures we take to prevent data leakage (i.e. data from one customer being accessed by another customer).

We do our filtering in the view endpoints, but sooner or later a developer on the team might forget to include a filter in an ORM query and cause a data leak.

So we came up with the idea to implement default filters on our models. Basically whenever a developer writes:

Some_Model.objects.all()

essentially they will execute:

Some_Model.objects.filter(customer_id = request.user.customer_id)

We plan to achieve this by overriding the objects property on each model to point to a manager with a filtered queryset. Something like this:

class Allowed_Some_Model_Manager(models.Manager):
    def get_queryset(self):
        return super(Allowed_Some_Model_Manager, self).get_queryset().filter(
            customer_id = request.user.customer_id
            # the problem is that request.user is not available in models.py
        )

class Some_Model(models.Model):
    name = models.CharField(max_length=50)
    customer = models.ForeignKey(Customer)

    objects = Allowed_Some_Model_Manager()
    all_objects = models.Manager() # use this if we want all objects

However our problem is that request.user is not available in models.py.

I have found several ways to solve this.

Option 1 includes passing the request.user to the manager each time. However since I am dealing with thousands of lines of old code, I don't want to go and change all of our ORM queries.

Option 2 involves using threading.local() to store request.user in thread-local data.

Something like this: https://djangosnippets.org/snippets/2179/

There is a module that seems to be doing this: https://github.com/Alir3z4/django-crequest
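For reference, a minimal sketch of what Option 2 usually looks like, written without Django imports so it can be run on its own. The names `_state`, `get_current_customer_id` and `CurrentCustomerMiddleware` are hypothetical, and the class only assumes the callable middleware shape (`get_response` in, response out); this is not the code from the snippet or from django-crequest.

```python
import threading

# Each thread sees its own attributes on this object.
_state = threading.local()


def get_current_customer_id():
    """Return the customer id bound to this thread, or None outside a request."""
    return getattr(_state, "customer_id", None)


class CurrentCustomerMiddleware:
    """Hypothetical middleware: binds the customer id for one request only."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        user = getattr(request, "user", None)
        # Anonymous users have no customer_id, so this falls back to None.
        _state.customer_id = getattr(user, "customer_id", None)
        try:
            return self.get_response(request)
        finally:
            # Always reset, so a reused worker thread never carries over
            # the previous request's customer.
            _state.customer_id = None
```

A manager's `get_queryset()` could then call `get_current_customer_id()` and apply the `customer_id` filter only when the result is not `None`. The reset in `finally` matters because worker threads are typically reused across requests.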

However, a lot of people seem to be against this idea... Namely these two discussions:

django get_current_user() middleware - strange error message which goes away if source code is "changed" , which leads to an automatic server restart

Django custom managers - how do I return only objects created by the logged-in user?

So that brings me to Option 3, which I came up with and cannot find anybody else using: use the Python builtins module to pass the user from the middleware to the model.

#middleware.py

import builtins
def process_request(self, request):
    if request.user.id:
        builtins.django_user = request.user


#models.py
import builtins

class Allowed_Some_Model_Manager(models.Manager):
    def get_queryset(self):
        if 'django_user' in vars(builtins):
            return super(Allowed_Some_Model_Manager, self).get_queryset().filter(
                customer_id = django_user.customer_id
            )
        else:
            return super(Allowed_Some_Model_Manager, self).get_queryset()

I have tested the code and it works on my local Django server and on Apache with mod_wsgi. But I would really like to hear whether this approach has any pitfalls. I have never used the builtins module before, and I am not sure I understand how it works or what its intended use case is.

Martin Taleski
  • You should **definitely not** use the third approach. The `builtins` module is like any other module. You can set an attribute on it, but it's not gonna be thread-safe, and it'll break as soon as you have two concurrent requests. – knbk Jul 04 '17 at 22:20
  • @knbk, why will the builtin break in concurrent requests? – Martin Taleski Jul 04 '17 at 22:21
  • Stop thinking about "the builtin". It is a simple module, nothing more, nothing less. Setting an attribute on a module is _not thread-safe_. A different request will start before the first one finishes, and now the first request will use the user set by the second request. That will cause data leakage to the wrong customer, which you are trying to avoid in the first place. It's basically option 2 without the thread-safety. – knbk Jul 04 '17 at 22:28
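The race described in these comments can be reproduced in plain Python. The sketch below is only an illustration under stated assumptions: `shared` is a `SimpleNamespace` standing in for a module such as `builtins`, `handle_request` and `run_two_requests` are hypothetical names, and a `threading.Barrier` forces the two "requests" to overlap the way concurrent requests can on a threaded server.

```python
import threading
import types

shared = types.SimpleNamespace()  # stands in for a module such as builtins
local = threading.local()         # per-thread storage (Option 2)


def handle_request(customer_id, barrier, results):
    # "Middleware" step: record which customer this request belongs to.
    shared.customer_id = customer_id
    local.customer_id = customer_id
    # Force the overlap: neither request continues until both have written.
    barrier.wait()
    # "Manager" step: read the value back to build the filter.
    results[customer_id] = (shared.customer_id, local.customer_id)


def run_two_requests():
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(cid, barrier, results))
        for cid in (1, 2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

After `run_two_requests()`, both entries hold the same value in the `shared` slot (whichever request wrote last), so one request would filter by the other customer's id. The `local` slot always matches the request's own `customer_id`.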
