Where do I store the .sql file that creates my database?

Question

I am trying to develop a web application that, like most of applications do, uses a database. To create the database I write a .sql-file that contains all the changes I make to the database. I don't know why exactly but in the past I always found it hard to empty my database or modify it later on when I find that a change would make sense. Since I am still learning all this database related stuff the first layout of my database will always be changed. To keep track of all those changes I grew the habit to create a this .sql file.

In this file I am always dropping all the tables that are already existing and creating all the tables new. I kind of do this to have a reference of the actual state of my database always on hand. Changes to a file are way easier for me than directly using the command line database tool. The first question would really be: Is this actually a good practice or is there another way of organizing things that I didn't hear of yet?

The real question is: Where do I store this file? Is it good practice to store it in the same git-repository as the actual code? Should I put it in git at all? I also think of git/github as a cloud storage, if my harddrive burns, all my projects will still be there since I got them on github. If I don't have the .sql file in there I would have to set up the database from new.

I don't see any reason not to store this in source control. It is, after all, code used to build an instance of the application. — David, Nov 03 '17 at 19:08
Maybe you're right and I'm just overthinking things... Maybe my mind is too much in the world of microservices all getting their own git-repo and I wouldn't know where the .sql file fits. The database is it's own service. — patsimm, Nov 03 '17 at 19:17
@patsimm FTR, modifying a database schema is notoriously difficult. [That's why it's so expensive.](https://www.red-gate.com/dynamic/purchase/product/sqlsourcecontrol). I'm not entirely convinced this question is on topic. It's more about development methodology than programming. — jpaugh, Nov 03 '17 at 21:37
I'm voting to close this question as off-topic because methodology questions are rare on SO. Maybe this is a better fit for Software Engineering, but the question probably needs some clean-up to avoid getting closed there as well. — jpaugh, Nov 03 '17 at 21:38

score 3 · Accepted Answer · answered Nov 03 '17 at 19:39

The general category for this sort of code is "database migrations." There are tools specialized to these tasks and various web application frameworks may support various different DB migrations tools or they may have their own features.

Probably the most popular tool/suite in this category for Python (models implemented in SQLAlchemy) would be Alembic. Some of the other options include Flyway, Liquibase, and Sqitch.

In all of these cases you manage some abstraction of your database schema (SQL, XML, YAML) and these tools generate the necessary SQL and other code to perform "migrations" from each version of the schema to the next throughout the history of your project. Usually these are generated as increments. They start with initial database schema (possibly completely empty), build and initialize the schema into that version; then build and migrate through each step to arrive that the desired version.

This can become arbitrarily complex if you're making more radical changes to your schema. Adding an extra column to a table without a "NOT NULL" constraint is trivial. Adding new M:N relations through "NOT NULL" junction tables and migrating from some denormalized schema to a higher normal form, for example, can entail intermediary stages where you might need to drop some constraints and tolerate referential integrity violations through some transitional state.

I highly recommend reading these websites, their tutorials, HOWTOs and other documentation to gain a deeper understanding of why these tools exist and how they approach this problem space.

score 0 · Answer 2 · answered Nov 03 '17 at 19:16

Yu are reinventing the wheel, my friend, your solution is versioning the DB Schema. And Yes, the changes should be added to the project files as almost all framework do. I recommend that you read the following question How do you version your database schema?

score 0 · Answer 3 · answered Nov 03 '17 at 19:17

0

Yes you should absolutely keep this code in source control. However, the exact location of the code within the repository is up to you and or your development team or management. Some good options would be an install, setup, SQL, or ddl folder.

answered Nov 03 '17 at 19:17

Sentinel

6,379
1
18
23

score 0 · Answer 4 · answered Nov 03 '17 at 21:37

Yes, what you are doing is correct.

Compare with ruby on rails: a file db/schema.rb contains the whole schema. It is good too have such a complete file. This allows you, for example, to easily bootstrap a new environment (e.g. a new testing environment). This file is obviously never used in production as it would wipe out all your data.

Then there are separate small files db/migrations/20171003_add_name_to_person_table.rb or whatever with incremental changes to the schema - called migrations. Those are used to change existing environments without losing data, with some mechanism to make sure each one is only run once per DB.

At your stage it is perfectly fine to be doing all of this manually. You can try to automate it as needed, later. It is good enough that you noticed that something is going on here.

That stuff must go into your code repository, wherever seems natural. /db, /schema, /etc might be some choices.

Where do I store the .sql file that creates my database?

4 Answers4