1

I am building a Django web app which will essentially serve static data to the users. By static, I mean that admins will be able to upload new datasets but no data entries will be made by users. Effectively, once the data is uploaded, it will be read-only on request by a user.

Given that these are quite large datasets (200k+ rows), I figured that SQL would be the best way to store the data - this avoids reading large datasets into memory (as you'd have to with a pickle or json?). This has the added bonus of using Django models to access the data.

However, I am not sure of the best way to do this, or if there is a better alternative to SQL. I currently have an admin page that allows you to upload .xlsx files which are then parsed and added as model entries row-by-row. It takes FOREVER (30+ minutes for 100K rows). Perhaps I should be creating a whole new db outside of Django and then importing that somehow, but I can't find much documentation on how this could/should be done. Any ideas would be greatly appreciated! Thanks in advance for any wisdom.

Cameron Hyde
  • 118
  • 7

1 Answers1

1

You can try to use .csv file format instead of .xlsx. Python has libraries that allow you to easily write to an sql database using .csv format (comma separated value). This answer could be of further assistance. I hope you find what you're looking for and happy coding!

rawplutonium
  • 386
  • 5
  • 15