
I've been brooding over the best way to build a multi-tenant application based on Django.

Some explanation:

  • The application can be used by several tenants (tenant1, tenant2, ...).

  • All tenant-specific data has to be secured against access by other tenants (and their users).

  • Optionally, tenants can create additional custom fields for application objects.

  • Of course, the underlying hardware limits the number of tenants on one "system".

1) Separating each tenant by e.g. sub-domain and using tenant-specific databases in the underlying layer

2) Using some tenant-ID in the model to separate the tenant-data in the database

I am thinking about deployment processes and the performance of the system parts (web servers, database servers, worker nodes, ...).

What would be the best setup? What are the pros and cons?

What do you think?

yuji
Oliver Rehburg
  • If I could up-vote this a hundred times, I would. – Brandon Taylor Aug 25 '11 at 17:07
  • Thx :-) Any ideas or additional aspects that I did not mention? – Oliver Rehburg Aug 25 '11 at 17:11
  • Django has its Sites framework built-in, which I think can be extended to have better multitenant support. There needs to be a way to identify which "site" we need to pull content for from something in the request, instead of hard coding which site in settings.py – Brandon Taylor Aug 25 '11 at 17:24
  • NoSQL support can also make multitenancy easier, but there is currently no good way that I'm aware of to tell which shard a particular site's data sits on. – Brandon Taylor Aug 25 '11 at 17:25
  • After looking into using the Sites framework, I came to the conclusion that its purpose is to make sharing data between different sites easier -- which is exactly the opposite of what you want for a multi-tenant app, for data security reasons. – Nexus Sep 16 '11 at 02:54

3 Answers


We built a multitenancy platform using the following architecture. I hope you can find some useful hints.

  • Each tenant gets a sub-domain (t1.example.com)
  • Using URL rewriting, the requests for the Django application are rewritten to something like example.com/t1
  • All URL definitions are prefixed with something like (r'^(?P<tenant_id>[\w\-]+)
  • A middleware processes and consumes the tenant_id and adds it to the request (e.g. request.tenant = 't1')
  • Now you have the current tenant available in each view without specifying the tenant_id argument in every view
  • In some cases you don't have the request available. I solved this issue by binding the tenant_id to the current thread (similar to how the current language is handled using threading.local)
  • Create decorators (e.g. a tenant-aware login_required), middlewares or factories to protect views and select the right models
  • Regarding the databases, I used two different scenarios:
    • Set up multiple databases and configure routing according to the current tenant. I used this first but switched to one database after about one year. The reasons were the following:
      • We didn't need a highly secure solution to separate the data
      • The different tenants used almost all the same models
      • We had to manage a lot of databases (and didn't build an easy update/migration process)
    • Use one database with some simple mapping tables for e.g. users and different models. To add additional, tenant-specific model fields we use model inheritance.
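
The steps above can be sketched roughly as follows. This is only an illustration, not the code we ran: the helper names (set_current_tenant, get_current_tenant), the class names and the "tenant_<id>" database alias scheme are assumptions made for the example. The middleware uses Django's callable middleware protocol with a process_view hook, and the router uses the db_for_read/db_for_write hooks of Django's database routers.

```python
import threading

# Thread-local storage for the active tenant (hypothetical helper names).
_state = threading.local()


def set_current_tenant(tenant_id):
    _state.tenant_id = tenant_id


def get_current_tenant():
    return getattr(_state, "tenant_id", None)


class TenantMiddleware:
    """Callable middleware that consumes the tenant_id URL argument."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            return self.get_response(request)
        finally:
            # Clear the binding so a reused worker thread cannot leak it.
            set_current_tenant(None)

    def process_view(self, request, view_func, view_args, view_kwargs):
        # Remove tenant_id from the captured URL arguments so that views
        # do not need a tenant_id parameter.
        tenant_id = view_kwargs.pop("tenant_id", None)
        request.tenant = tenant_id
        set_current_tenant(tenant_id)
        return None  # continue with normal view processing


class TenantRouter:
    """Database router that picks an alias from the current tenant."""

    def _alias(self):
        tenant_id = get_current_tenant()
        # Assumption: each tenant database is registered in DATABASES
        # under the alias "tenant_<id>"; otherwise use "default".
        return f"tenant_{tenant_id}" if tenant_id else "default"

    def db_for_read(self, model, **hints):
        return self._alias()

    def db_for_write(self, model, **hints):
        return self._alias()
```

The router never sees the request, which is exactly the case where the thread-local binding is needed. If the application runs async views, contextvars.ContextVar is the usual substitute for threading.local.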

Regarding the environment we use the following setup:

From my point of view this setup has the following pros and cons:

Pro:

  • One application instance knowing the current tenant
  • Most parts of the project don't have to bother with tenant specific issues
  • Easy solution for sharing entities between all tenants (e.g. messages)

Contra:

  • One quite large database
  • Some very similar tables due to the model inheritance
  • Not secured on the database layer

Of course, the best architecture strongly depends on your requirements, such as the number of tenants, how much your models differ between tenants, your security requirements and so on.

Update: After reviewing our architecture, I suggest not rewriting the URL as described in points 2-3. I think a better solution is to pass the tenant_id as a request header and extract it (point 4) from the request with something like request.META.get('HTTP_TENANT_ID', None) (Django exposes request headers in request.META with an HTTP_ prefix). This way you get neutral URLs, and it's much easier to use Django built-in functions (e.g. {% url ...%} or reverse()) or external apps.
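
A minimal sketch of the header-based variant, assuming the front-end web server sets a header named Tenant-Id (the header name and the class name are made up for this example). Django converts request headers to request.META keys by upper-casing them, replacing hyphens with underscores and adding an HTTP_ prefix, so the header shows up as HTTP_TENANT_ID.

```python
class HeaderTenantMiddleware:
    """Callable middleware that reads the tenant from a request header."""

    # Assumption: the reverse proxy sets "Tenant-Id" on every request.
    META_KEY = "HTTP_TENANT_ID"

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # None means the header was absent; views or decorators can
        # decide how to treat requests without a tenant.
        request.tenant = request.META.get(self.META_KEY)
        return self.get_response(request)
```

Since clients can send arbitrary headers, the proxy should always overwrite this header with the value derived from the sub-domain rather than pass a client-supplied one through.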

Merwan
Reto Aebersold
  • Thanks a lot. For the intended application, clear and secure separation of data is very important. I also think that about 70% of the models are the same for all tenants. – Oliver Rehburg Aug 25 '11 at 18:11
  • Then I suggest you use a multiple-database setup with automatic routing. I would use [south](http://south.aeracode.org/) to manage the databases, as it now supports multiple databases. – Reto Aebersold Aug 25 '11 at 18:15
  • Thx. I actually prefer the multiple-database setup, because the separation is clearer and moving tenants to another db-server might be easier... South is already part of my technology stack for migration tasks... – Oliver Rehburg Aug 26 '11 at 09:03
  • Is this solution still current? I read in different places that one should not rely on threading.local (anymore). – Thijs van Dien Sep 26 '12 at 00:13
  • How did you implement the first point (each tenant gets a sub-domain, t1.example.com)? How can this be achieved on a local machine and on a production server? – vijay shanker Dec 05 '14 at 06:50
  • @RetoAebersold Do you need to maintain different versions simultaneously? – Qiulang Jul 07 '21 at 08:48

Here are some pointers to related discussions:

akaihola

I recommend taking a look at https://github.com/bcarneiro/django-tenant-schemas. It solves the problem in a way similar to what Reto described, except that it uses PostgreSQL schemas.

Clash