• Stars
    star
    228
  • Rank 175,267 (Top 4 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 5 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Safely separate multiple tenants in a Django database

django-scopes

Build status PyPI Python versions PyPI - Django Version

Motivation

Many of us use Django to build multi-tenant applications where every user only ever gets access to a small, separated fraction of the data in our application, while at the same time having some global functionality that makes separate databases per client infeasible. While Django does a great job protecting us from building SQL injection vulnerabilities and similar errors, Django can't protect us from logic errors and one of the most dangerous types of security issues for multi-tenant applications is that we leak data across tenants.

It's so easy to forget that one .filter call and it's hard to catch these errors in both manual and automated testing, since you usually do not have a lot of clients in your development setup. Leaving radical, database-dependent ideas aside, there aren't many approaches available in the ecosystem to prevent these mistakes from happening aside from rigorous code review.

We'd like to propose this module as a flexible line of defense. It is meant to have little impact on your day-to-day work, but act as a safeguard in case you build a faulty query.

Installation

There's nothing required apart from a simple

pip install django-scopes

Compatibility

This library is tested against Python 3.8-3.11 and Django 3.2-4.2.

Usage

Let's assume we have a multi-tenant blog application consisting of the three models Site, Post, and Comment:

from django.db import models

class Site(models.Model):
	name = models.CharField(…)

class Post(models.Model):
	site = models.ForeignKey(Site, …)
	title = models.CharField(…)

class Comment(models.Model):
	post = models.ForeignKey(Post, …)
	text = models.CharField(…)

In this case, our model Site acts as the tenant for the blog posts and their comments, hence our application will probably be full of statements like Post.objects.filter(site=current_site), Comment.objects.filter(post__site=current_site), or more complex when more flexible permission handling is involved. With django-scopes, we encourage you to still write these queries with your custom permission-based filters, but we add a custom model manager that has knowledge about posts and comments being part of a tenant scope:

from django_scopes import ScopedManager

class Post(models.Model):
	site = models.ForeignKey(Site, …)
	title = models.CharField(…)

	objects = ScopedManager(site='site')

class Comment(models.Model):
	post = models.ForeignKey(Post, …)
	text = models.CharField(…)

	objects = ScopedManager(site='post__site')

The keyword argument site defines the name of our scope dimension, while the string 'site' or 'post__site' tells us how we can look up the value for this scope dimension in ORM queries.

You could have multi-dimensional scopes by passing multiple keyword arguments to ScopedManager, e.g. ScopedManager(site='post__site', user='author') if that is relevant to your usecase.

Now, with this custom manager, all queries are banned at first:

>>> Comment.objects.all()
ScopeError: A scope on dimension "site" needs to be active for this query.

The only thing that will work is Comment.objects.none(), which is useful e.g. for Django generic view definitions.

Activate scopes in contexts

You can now use our context manager to specifically allow queries to a specific blogging site, e.g.:

from django_scopes import scope

with scope(site=current_site):
	Comment.objects.all()

This will automatically add a .filter(post__site=current_site) to all of your queries. Again, we recommend that you still write them explicitly, but it is nice to know to have a safeguard.

Of course, you can still explicitly enter a non-scoped context to access all the objects in your system:

with scope(site=None):
	Comment.objects.all()

This also works correctly nested within a previously defined scope. You can also activate multiple values at once:

with scope(site=[site1, site2]):
	Comment.objects.all()

Sounds cumbersome to put those with statements everywhere? Maybe not at all: You probably already have a middleware that determines the site (or tenant, in general) for every request based on URL or logged in user, and you can easily use it there to just automatically wrap it around all your tenant-specific views.

Functions can opt out of this behavior by using

from django_scopes import scopes_disabled


with scopes_disabled():
    …

# OR

@scopes_disabled()
def fun(…):
    …

Please note that django-scopes is also active during migrations, so if you are writing a data migration – or have written one in the past! – you'll have to add appropriate scoping or use the scopes_disabled context.

Custom manager classes

If you were already using a custom manager class, you can pass it to a ScopedManager with the _manager_class keyword like this: from django.db import models

from django.db import models

class SiteManager(models.Manager):

	def get_queryset(self):
		return super().get_queryset().exclude(name__startswith='test')

class Site(models.Model):
	name = models.CharField(…)

	objects = ScopedManager(site='site', _manager_class=SiteManager)

Scoping the User model

Assume you've got two models User and Post. Using the examples above, you can ensure that users only ever see their own diary posts. But how about leaking other users to the currently logged in user? If you application doesn't have much (or any) interaction between users, you can scope the user model. Please note that you'll need a custom user model. Which base classes your user and manager work off will very between projects.

class User(AbstractUser):
	objects = ScopedManager(user='pk', _manager_class=UserManager)

	# (...)

Activating the scope comes with a little caveat - you need to use the users primary key, not the whole object:

with scope(user=request.user.pk):
	# do something :)

Caveats

Locking

With django-scopes, a seemingly innocent query like

Comment.objects.select_for_update().get(pk=3)

could cause unexpected locking across your database, since django-scopes will auto-add one or more JOIN statements to the query, and joined tables will also be locked. One possible fix is of course using scopes_disabled(), around this query. On most modern databases, there's also a way to specify explicitly which tables you want locked:

Comment.objects.select_for_update(of=("self",)).get(pk=3)

You can check if your database supports this feature at runtime using connection.features.has_select_for_update_of.

Admin

django-scopes is not compatible with the django admin out of the box, integration requires a custom middleware. (If you write one, please open a PR to include it in this package!)

Testing

We want to enforce scoping by default to stay safe, which unfortunately breaks the Django test runner as well as pytest-django. For now, we haven't found a better solution than to monkeypatch it:

from django.test import utils
from django_scopes import scopes_disabled

utils.setup_databases = scopes_disabled()(utils.setup_databases)

You can wrap many of your test and fixtures inside scopes_disabled() as well, but we wouldn't advise to do it with all of them: Especially when writing higher-level functional tests, such as tests using Django's test client or tests testing celery tasks, you should make sure that your application code runs as it does in production. Therefore, writing tests for a project using django-scopes often looks like this:

@pytest.mark.django_db
def test_a_view(client):
    with scopes_disabled():
        u = User.objects.create(...)
    client.post('/user/{}/delete'.format(u.pk))
    with scopes_disabled():
    	assert not User.objects.filter(pk=u.pk).exists()

If you want to disable scoping or activate a certain scope whenever a specific fixture is used, you can do so in py.test like this:

@pytest.fixture
def site():
    s = Site.objects.create(...)
    with scope(site=s):
        yield s

When trying to port a project with lots of fixtures, it can be helpful to roll a small py.test plugin in your conftest.py to just globally disable scoping for all fixtures which are not yielding fixtures (like the one above):

@pytest.hookimpl(hookwrapper=True)
def pytest_fixture_setup(fixturedef, request):
    if inspect.isgeneratorfunction(fixturedef.func):
        yield
    else:
        with scopes_disabled():
            yield

ModelForms

When using model forms, Django will automatically generate choice fields on foreign keys and many-to-many fields. This won't work here, so we supply helper field classes SafeModelChoiceField and SafeModelMultipleChoiceField that use an empty queryset instead:

from django.forms import ModelForm
from django_scopes.forms import SafeModelChoiceField

class PostMethodForm(ModelForm):
    class Meta:
        model = Comment
        field_classes = {
            'post': SafeModelChoiceField,
        }

django-filter

We noticed that django-filter also runs some queries when generating filtersets. Currently, our best workaround is this:

from django_scopes import scopes_disabled

with scopes_disabled():
    class CommentFilter(FilterSet):
        …

Uniqueness

One subtle class of bug that can be introduced by adding django-scopes to your project is if you try to generate unique identifiers in your database with a pattern like this:

def generate_unique_value():
    while True:
        key = _generate_random_key()
        if not Model.objects.filter(key=key).exists():
            return key

If you want keys to be unique across tenants, make sure to wrap such functions with scopes_disabled()!

When using a ModelForm (or class based view) to create or update a model, unexpected IntegrityErrors may occur. ModelForms perform a uniqueness check before actually saving the model. If that check runs in a scoped context, it cannot find conflicting instances, leading to an IntegrityErrors once the actual .save() happens. To combat this, wrap the call in scopes_disabled().

class Site(models.Model):
    name = models.CharField(unique=True, …)

    # (...)

    def validate_unique(self, *args, **kwargs):
        with scopes_disabled():
            super().validate_unique(*args, **kwargs)

Further reading

If you'd like to read more about the practical use of django-scopes, there is a blog post about its introduction in the pretix project.

Here is a guide on how to write a shell_scoped django-admin command to provide a scoped Django shell.

More Repositories

1

python-fints

Pure-python FinTS (formerly known as HBCI) implementation
Python
325
star
2

python-sepaxml

SEPA Direct Debit XML generation in python
Python
111
star
3

msrtool

CLI tool to use the MSR605 magstripe reader/writer
Python
54
star
4

android-barcode-keyboard

Barcode keyboard for Android (not actively maintained)
Kotlin
42
star
5

django-i18nfield

Store internationalized strings in Django models with full forms support
Python
36
star
6

django-jsonfallback

Use PostgreSQL's and MySQL's JSONField, but fall back to a TextField implementation on other databases
Python
24
star
7

snakelet

Minimal snake game in JavaScript
JavaScript
18
star
8

django-hierarkey

Flexible and powerful hierarchical key-value store for your Django models
Python
18
star
9

defusedcsv

Python library to protect your users from Excel injections in CSV-format exports, drop-in replacement for standard library's csv module
Python
14
star
10

instantgallery

Beautiful photo gallery for the web in one Unix command line. Generates static html and javscript files.
Python
13
star
11

php-benchmark

time benchmarking for PHP5 code
PHP
10
star
12

facelock

This is how I lock my laptop screen whenever I'm around security people.
Python
8
star
13

django-debug-request

Python
6
star
14

django-i18n-demoapp

Demo code for my talk at DjangoCon Europe 2017 / DjangoCon US 2018
Python
6
star
15

cockatiel

Simple replicating file storage
Python
4
star
16

lendlist

Never again forget what you borrowed from someone and keep track of what your friends borrowed from you (Android app).
Java
4
star
17

ba-dynfil

Code used to obtain the results for my Bachelor's thesis
Python
4
star
18

geeksqlite

GTK sqlite3 database browser
Python
4
star
19

confluence-scraper

Download all content from Confluence through the API
Python
4
star
20

mathdown

Proof of Concept: Android editor for markdown with integration of LaTeX formulas. Made for lecture notes of science students.
3
star
21

logmylife

Logs are awesome!
Python
3
star
22

django-cockatiel

Django storage backend for the cockatiel distributed file storage
Python
3
star
23

WLANalyse

A wardriver's statistic tool, written for Ubuntu linux (only for statistical use, no GPS implemented)
PHP
2
star
24

pysurfstick

Building a python interface for using UMTS sticks (especially Huawei 1550)
Python
2
star
25

pybob

Python3 interface for boblightd (LED controller)
Python
2
star
26

bottr

Ein Jabber-Bot. Der beste Jabber-Bot. Naja, immerhin sein Core.
PHP
2
star
27

reader

[deprecated] Rich web-based feed reader.
PHP
1
star
28

conwaycli

Implementations of Conway's Game of Life with RANDOM figures on cli. Yes, I was bored when I created this!
Python
1
star
29

hangman

[DE] a very special hangman implementation for 2 devices
Python
1
star
30

sudoku

BackTrack-Algorithmus zum Sudoku-LΓΆsen, mit Konsolen- und GTK-Interface; implementiert in Python
Python
1
star
31

espeak.py

Python interface for eSpeak speech synthesizer.
Python
1
star
32

postgres_integer_to_bigint

Parses a SQL script generated by pg_dump and replaces all integer primary keys of name "id" with bigints, taking care of all of their foreign keys as well.
Python
1
star