
I have a Django application that uses a Postgres database. I need to be able to back up and restore the database, both to ensure no data is lost and to copy data from the production server to the development server during testing.

There seem to be a few different ways to do this:

  1. Just interact with the db directly. So, for Postgres I might write a script using pg_dumpall and psql.

  2. Use the sqlclear/sqlall commands that come with Django.

  3. Use the dumpdata/loaddata commands that come with Django. That is, create new fixtures from the db you want to back up and then load them into the db you want to restore.

  4. Use a Django plugin like django-dbbackup.

I really don't understand the pros/cons of these different techniques.

Just off the top of my head: Option 1 is database-specific and option 3 seems more suited to setting up initial data. But I'm still not sure what advantages option 4 has over option 2.

dusk
trubliphone
  • Why don't you just create a copy of the entire database? http://www.postgresql.org/docs/8.1/static/backup.html#BACKUP-DUMP – karthikr Jan 10 '14 at 16:29
  • Does django-dbbackup even work? I can clearly see code there that has no chance of working: https://bitbucket.org/mjs7231/django-dbbackup/src/4702d2cf91987fd8a4122b95afca5d42cd477d00/dbbackup/storage/s3_storage.py?at=default#cl-56 – vartec Jan 10 '14 at 16:30
  • @karthikr - That would work but the commands are specific to Postgres; if the underlying db changes, I would have to rewrite the script. – trubliphone Jan 10 '14 at 16:32
  • @vartec - I haven't fully tested it yet. The bit of code you were looking at saves to Amazon S3, I was just going to save to a local file. – trubliphone Jan 10 '14 at 16:33
  • Fair enough, but I'd be cautious of code containing such an obvious error, especially for a task as important as taking backups. – vartec Jan 10 '14 at 16:52
  • You mention you ended up writing your own scripts - how do they compare to django-dbbackup? and care to share? – Chozabu Jan 07 '16 at 23:11
  • @Chozabu - The script that I wrote can be found here: http://pastebin.com/3afcrHqe . It assumes a standard Django "settings.py" with all the database info. – trubliphone Jan 08 '16 at 03:01
  • @trubliphone Fantastic! seems rather sensible - the backup runs fine, after filling in the couple of lines that need tailoring to my project, but what about restoring? – Chozabu Jan 08 '16 at 10:50
  • @Chozabu - The restore is very similar: http://pastebin.com/2hbkwsp0 . – trubliphone Jan 08 '16 at 13:20
  • @trubliphone Looks good! Thanks again, upvotes all over :) I'll do some heavy testing on these (django-db backup was giving some UTF-8 related errors on restore) - I was just starting on my own basic bash scripts for this, but your python scripts look much better! – Chozabu Jan 08 '16 at 16:02

2 Answers


The problem with options 1-3 is that media files (anything uploaded through a FileField) are not included in the backup. It is possible to back up the directory containing the media files separately. However, because Django doesn't remove files when they are no longer referenced by a FileField, you will inevitably end up with files in the backup that don't need to be there.

That's why I would go with option #4. In particular, I recommend django-archive*. Some of its features include:

  • Dumps the contents of all important models (by default ContentType, Permission, and Session are excluded since they are populated by manage.py migrate) and lets you choose additional models to exclude.

  • Includes media files referenced by FileField and ImageField fields. Note that only the files referenced by rows in the database are included; files left over by deleted rows are ignored.

  • Produces a single archive containing both the database backup and media files.

  • Provides options for customizing the location where archives should be stored, the filename format, and archive type (gz and bz2).

Installation is as simple as adding django_archive to INSTALLED_APPS and setting options in settings.py if needed. Once installed, you can immediately create an archive of your entire database (including media files) by running:

./manage.py archive

* Disclaimer: I am the author of the package

Nathan Osman

For regular backups I'd go for option 1, using PostgreSQL's own native tool, as it is probably the most efficient.
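As a rough illustration of option 1, here is a minimal sketch that builds pg_dump and pg_restore command lines from a Django-style DATABASES entry. The function names and the example file names are hypothetical, and passing the password through the PGPASSWORD environment variable is just one choice (a ~/.pgpass file is another). It is not the script linked in the comments.

```python
import os
import subprocess


def _connection_args(db):
    """Translate HOST/PORT/USER from a Django-style dict into libpq CLI flags."""
    args = []
    if db.get("HOST"):
        args += ["--host", db["HOST"]]
    if db.get("PORT"):
        args += ["--port", str(db["PORT"])]
    if db.get("USER"):
        args += ["--username", db["USER"]]
    return args


def _environment(db):
    """Copy the current environment and add PGPASSWORD if a password is set."""
    env = dict(os.environ)
    if db.get("PASSWORD"):
        env["PGPASSWORD"] = db["PASSWORD"]
    return env


def build_pg_dump_command(db, outfile):
    """Return (argv, env) for dumping one database in pg_dump's custom format."""
    if "postgresql" not in db["ENGINE"]:
        raise ValueError("only PostgreSQL is handled in this sketch")
    argv = ["pg_dump", "--format=custom", "--file", outfile]
    argv += _connection_args(db)
    argv.append(db["NAME"])
    return argv, _environment(db)


def build_pg_restore_command(db, infile):
    """Return (argv, env) for restoring a custom-format dump into db["NAME"]."""
    if "postgresql" not in db["ENGINE"]:
        raise ValueError("only PostgreSQL is handled in this sketch")
    # --clean drops existing objects before recreating them, so use with care.
    argv = ["pg_restore", "--clean", "--if-exists", "--no-owner",
            "--dbname", db["NAME"]]
    argv += _connection_args(db)
    argv.append(infile)
    return argv, _environment(db)


def run(argv, env):
    """Execute the command and raise CalledProcessError if it fails."""
    subprocess.run(argv, env=env, check=True)
```

In a Django project you would typically pass settings.DATABASES["default"] as db, then call run(*build_pg_dump_command(db, "backup.dump")) on the production server and run(*build_pg_restore_command(db, "backup.dump")) on the development server.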

I would argue that option 2 is primarily concerned with creating the tables and loading initial data so is not suitable for backups.

Option 3 can be used for backups and would be particularly useful if you needed to migrate to a different database platform since the data is dumped in a non-SQL form, i.e. JSON understood by Django.
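To make option 3 concrete, here is a small sketch that assembles the dumpdata and loaddata invocations. The helper names and the default exclusions are assumptions for illustration; it also assumes manage.py is in the current directory and a Django version recent enough to support the --natural-primary and --output flags.

```python
import subprocess
import sys


def dumpdata_args(outfile, exclude=("contenttypes", "auth.permission")):
    """Build the argv for dumping the whole database to a JSON fixture.

    Natural keys plus excluding content types and permissions make the
    fixture easier to load into a database that has already been migrated.
    """
    args = [sys.executable, "manage.py", "dumpdata",
            "--natural-foreign", "--natural-primary",
            "--indent", "2", "--output", outfile]
    for label in exclude:
        args += ["--exclude", label]
    return args


def loaddata_args(fixture):
    """Build the argv for loading a fixture file into the current database."""
    return [sys.executable, "manage.py", "loaddata", fixture]


def run(args):
    """Execute a manage.py command and raise CalledProcessError if it fails."""
    subprocess.run(args, check=True)
```

For example, run(dumpdata_args("backup.json")) on the source project followed by run(loaddata_args("backup.json")) on the target, after running migrate there. Note that dumpdata can be slow and memory-hungry on large databases, which is one reason the native tools are usually preferred for routine backups.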

Option 4: the plugin appears to use the database's own backup tools (as in option 1) but additionally helps you push your backups to cloud storage such as Amazon S3 or Dropbox.

Anentropic
  • I wound up writing my own Python scripts to back up and restore the database. They read from the Django settings module to figure out what type of db it is. Currently, they only support Postgres, but there are hooks for other formats. – trubliphone Jan 17 '14 at 21:49
  • @trubliphone Can you upload the script to Git or somewhere and share it? – ajinzrathod May 04 '21 at 09:41
  • @ajinzrathod - I haven't used this code in _years_. But you can find links to my old scripts in the comments above: https://pastebin.com/3afcrHqe and http://pastebin.com/2hbkwsp0. (Note that these are 5+ years old; Django may have moved on in the meantime.) – trubliphone May 04 '21 at 10:18
  • Oh sorry, I did not see the year. Thanks anyway. – ajinzrathod May 04 '21 at 11:13