3

Imagine you have the following situation:

for i in xrange(100000):
  account = Account()
  account.foo = i
  account.save

Obviously, the 100,000 INSERT statements executed by Django are going to take some time. It would be nicer to be able to combine all those INSERTs into one big INSERT. Here's the kind of thing I'm hoping I can do:

inserts = []

for i in xrange(100000):
  account = Account()
  account.foo = i
  inserts.append(account.insert_sql)

sql = 'INSERT INTO whatever... ' + ', '.join(inserts)

Is there a way to do this using QuerySet, without manually generating all those INSERT statements?

Jason Swett
  • 43,526
  • 67
  • 220
  • 351
  • Not the same, but possible related question: http://stackoverflow.com/questions/1136106/efficent-way-to-insert-thousands-of-records-into-a-table-sqlite-python-django – Shawn Chin Mar 16 '11 at 16:39

2 Answers2

8

As shown in this related question, one can use @transaction.commit_manually to combine all the .save() operations as a single commit to greatly improve performance.

@transaction.commit_manually
def your_view(request):
    try:
        for i in xrange(100000):
            account = Account()
            account.foo = i
            account.save()   
    except: 
        transaction.rollback()
    else:
        transaction.commit() 

Alternatively, if you're feeling adventurous, have a look at this snippet which implements a manager for bulk inserting. Note that it works only with MySQL, and hasn't been updated in a while so it's hard to tell if it will play nice with newer versions of Django.

Community
  • 1
  • 1
Shawn Chin
  • 84,080
  • 19
  • 162
  • 191
  • 2
    This will make little difference on most engines, except possibly SQLite. You're still instantiating thousands of models and running thousands of separate SQL inserts. – Glenn Maynard Mar 16 '11 at 17:14
  • (More accurately, it depends on the database backend and configuration; it's likely to help, but in my experience using batch `INSERT INTO ... FROM VALUES` matters a lot more.) – Glenn Maynard Mar 16 '11 at 17:23
  • I'm using MySQL InnoDB and MyISAM and it such approach really does make a difference (more than 100 times faster) – Bob Oct 01 '14 at 02:13
0

You could use raw SQL.

Either by Account.objects.raw() or using a django.db.connection objects.

This might not be an option if you want to maintain database agnosticism.

http://docs.djangoproject.com/en/dev/topics/db/sql/

If what you're doing is a one time setup, perhaps using a fixture would be better.

Stephen Paulger
  • 5,204
  • 3
  • 28
  • 46