37

In our current development workflow we have introduced database migrations (using Ruckusing) to keep our developers' db schemas in sync. It works great and is pretty straightforward to use, but now that we have switched to git as our VCS we are facing the next problem in our database versioning system.

When checking out a branch that has been in development for some time, it might happen that the database schema has diverged a lot from the schema in the branch I'm coming from. This causes database conflicts in some cases. Logically it seems that we need to run migrations depending on the branch we were on previously, but that can get complex really fast and is sure to cause problems for some people. And as far as I know there isn't a db migration system that is branch-aware?

Added complexity comes when switching to a feature branch: we might need to run some migrations up while running others down. Technically this seems impossible using our current db migration scripts. Are there any sane alternatives? Are there any preferred ways of working with database migrations in a very active and branched development system?

ChrisR
  • This is what I would call Pandora's box. It's massively complex and needs a lot of rules and education for everybody working on the complete codebase. This includes rules to prioritize merging of branches containing db changes etc... – NDM Oct 06 '14 at 09:49

5 Answers

29

I wouldn't really agree with incremental migrations being rotten. Having a set of homegrown scripts would, in my opinion, be a worse approach; a real tool for such a job will make tracking those changes easier. I've had to deal with a similar situation myself before, so hopefully I can share some of the insights.

In my experience, RDBMS schemas and branches don't mix very well. Depending on your branching, the schemas should probably be at least somewhat similar, in which case the migrations should not differ too much. Or I might just have misunderstood the full extent of the problem. If you're e.g. trying to keep customer-specific code on a branch, then maybe you should consider a way to modularize it instead. We did something like this, with rules stating that customer-specific schema changes and code could only ever depend on the common code base, not the other way around. We also set the precedence between module changesets based on module and date, so for the most part we knew the order in which the changes were to be applied. YMMV, of course, and it's hard to give specifics without knowing your current setup.

At my old company we successfully used a tool called Liquibase, which sounds similar to what you're using. Basically it is a tool for taking a DB schema, and all the data, from one known state to another known state. The same changeset is applied only once, since Liquibase maintains a changelog with checksums. The changelogs are written in a specific XML format. I can strongly recommend trying it out if you need alternatives.
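
For illustration, applying a changelog from the Liquibase command line might look roughly like this (the changelog path, JDBC URL, and credentials are assumptions, not details from the answer):

```
# "update" applies only the changesets not yet recorded in Liquibase's
# own changelog table; already-applied changesets (tracked by
# id/author/checksum) are skipped.
liquibase \
  --changeLogFile=db/changelog-master.xml \
  --url=jdbc:mysql://localhost/app \
  --username=dev --password=dev \
  update
```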

Anyway, the way we handled customer code and branches was to have a specific DB/schema for a given branch. This way you could have the schema and data from the branching point, and only migrate the diff to the current situation. We did not undo changes, even though Liquibase in theory supports this, as we felt it was way too cumbersome and error-prone. Given that Liquibase keeps its own state, the migration was always as easy as taking the current state on a given branch and applying all changesets. Only new changesets were applied, leaving the schema in a good state.

We used Mercurial, which is distributed like git, so the setup was quite similar. We also had developer-specific local DBs on the dev laptops, and a number of environments, both for different customers and for different phases (development, integration, production), so the model was put to a real test, and it worked surprisingly well. We had some conflicts in the changesets, but we were mostly able to resolve those soon after the problem was introduced. The local development environments were really the hardest part, since during development some schema changes might be introduced that were not always compatible with later changesets, but the structured nature of the changes, and having a known state to revert to, led to very few real problems.

There are a few caveats with this approach:

  1. All and any changes to the schema must be implemented in the changesets. The biggest cause of confusion was always someone just fiddling around a bit.
  2. The first point applies even if you're using a tool that modifies the schema, e.g. an ORM tool like Hibernate. You need to be pretty intimate with that tool to understand the changes it makes and requires.
  3. All users must buy into this, and be educated to follow the rules. See point 1.
  4. There comes a point when migrating lots of changesets starts taking too much time. At this time you will need to create a new baseline, which can be a bit tricky, especially with a lot of branches. It's good to plan ahead for this as well, and at least know of all existing DB-branches.
  5. You need to plan ahead a bit with the branches, to know whether they're going to migrate back to master at some point. Naive merging might not work well for schema changes.
  6. For very long-lived branches and separated datasets this model might not be strong enough.

The point is, however, that the more structure and control you have over the database, the easier migrations will be. Tools like Liquibase can therefore be a really valuable asset to help you track those changes. This applies to complex models to an even greater extent than to simple ones, so please at least don't consider dumping the tools you already have in place. And take some time to explore alternative tools.

Some structure and control is better than none, or, even worse, thinking you are in control with a big bunch of manual scripts.

Vadim Kotov
Kai Inkinen
  • You say you had "a specific DB/schema for a given branch" - do you mean you had some sort of reference copy of the database somewhere, which you could copy to a developer's database? Or every developer had a separate schema for each branch? Or something else? – Tom Anderson Jun 30 '11 at 09:41
  • I'm not sure why you think having a big bunch of scripts is not being in control. If your team wrote and maintains those scripts, you're in *complete* control. You're not trusting some third-party tool to solve a complicated problem correctly; you're directly in charge of what's being done. – Tom Anderson Jun 30 '11 at 09:43
16

I think the whole idea of incremental migrations is pretty rotten, really. In a complex environment like yours, it really doesn't work. You could make it work for simple branch patterns, but for anything complicated, it will be a nightmare.

The system I'm working with now takes a different approach: we have no ability to make incremental migrations, only to rebuild the database from a baseline. During initial development, that baseline was an empty database; during maintenance, it's a copy of the live database (restored from a dump). We just have a pile of SQL and XML scripts that we apply to the baseline to get a current system (migrations, essentially, but not designed to be run incrementally). Updating or switching branches is then very simple: nuke the database, load a dump to establish the baseline, run the scripts.
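
As a rough sketch of that cycle (the database name "app", the dump path, and the scripts directory are invented for illustration, assuming MySQL):

```
#!/bin/sh
set -e
mysql -e 'DROP DATABASE IF EXISTS app; CREATE DATABASE app'   # nuke the database
mysql app < baseline/app-baseline.sql                         # restore the baseline dump
for script in db-scripts/*.sql; do                            # apply every script, in order
    mysql app < "$script"
done
```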

This process is not as quick as just running a few migrations, but it's quick enough. It takes long enough that you can go and get a cup of coffee, but not long enough to get lunch.

The huge advantage is that starting by nuking the database means the process is completely history-independent, so it doesn't need to know or care about crossing branches, going back in time, or anything else.

When you take a release live, you obviously do things slightly differently: you don't nuke the database or load a dump, because the system is already at the baseline (the baseline is defined as the state of the live system!). You just run the scripts. And after that, make a fresh dump to be used as a new baseline for development.
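
Under the same assumed names, the release variant skips the nuke-and-restore and ends by cutting the next baseline:

```
# The live system is already at the baseline, so only the scripts run;
# afterwards a fresh dump becomes the new development baseline.
for script in db-scripts/*.sql; do
    mysql app < "$script"
done
mysqldump app > baseline/app-baseline.sql
```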

Tom Anderson
  • Is this a tried and proven technique or something you and your colleagues have just grown into? I'd like to do some more reading and research on it but don't know where to look or what to search for :) – ChrisR Jun 20 '11 at 11:15
  • It's our breakthrough discovery, which I'm sharing with you for a low, low price. Actually, I have no idea if this approach is widespread outside my company (it's not even universal within my company!). It's not something I've come across in discussions about migrations etc. That would make a good question for [Programmers](http://programmers.stackexchange.com/), actually. – Tom Anderson Jun 20 '11 at 11:34
  • Mind you, two people liked this idea enough to upvote it, so perhaps they use this approach too? We'll probably never know. – Tom Anderson Jun 20 '11 at 11:35
  • How is running 'a pile of SQL and XML scripts' against the DB different from incremental migrations? Do you mean that the scripts do not depend upon each other? It sounds essentially the same as an incremental migration system (start from baseline, apply scripts in order). – Luke H Jun 25 '14 at 23:08
  • I've given this some thought, and I think that, if I was using git, I'd be reluctant to adopt anything that made branch changes take longer than a few seconds. Would be interested to hear if @TomAnderson and his team ever found this to be a problem. – Alex Jan 31 '17 at 21:33
  • @Alex Oh yeah, using this approach to change branches took freaking forever! Initially "long enough that you can go and get a cup of coffee", but later in the project, >10 minutes, because of the amount of test data. So we added a caching layer: after a migration, the tool saves a dump on the shared database server; before a migration, it checks if there's a dump of the target state, and if there is, just loads it. That took the common case back to seconds. – Tom Anderson Feb 04 '17 at 15:00
  • That said, the slowness of the migrations was because we were loading a lot of test data using a very slow ETL tool which went via our ORM. If the test data had been SQL scripts, the rebuild would still have been really fast, and we wouldn't have needed the caching layer. – Tom Anderson Feb 04 '17 at 15:01
  • I take it you weren't doing feature branches? We have several devs and they're all working on small feature branches, which when complete, are merged back into master. We can't afford, money or time-wise, to have a separate db for each branch. – Tim Gautier May 22 '18 at 15:29
  • @TimGautier We weren't doing feature branches, but I don't see how it would be different. We did have separate development and release branches (the latter for urgent bugfixes etc), and it worked fine. Why do you think having a separate database for each branch would cost more time and money? I suspect we are starting with different assumptions here. – Tom Anderson May 29 '18 at 11:16
  • @TomAnderson Well, feature branches mean we have a lot of branches (10-20 at a time) and are creating new ones all the time (typically several a day). Our database is a couple terabytes and spinning up a new one isn't free or instant. We'd be spending all our time waiting for DBs to come up and the cost would be crazy. I've done the DB per branch thing and it works great, but only if you have a small DB or can tolerate very stale data. – Tim Gautier May 29 '18 at 16:58
  • @TimGautier Oh wow, if you have a >1 TB database then my method is indeed not a useful one at all! You might be able to use a similar approach by having a pool of prepared databases, depending on how much the database varies across branches. Or having a pool of databases at the baseline, if the migrations from there to the desired state are small. Either way, you would want a background job churning out fresh baseline databases to keep the pool full. – Tom Anderson May 30 '18 at 10:49
3

I'm in a similar situation where I work on a live website and several development branches in which I need to change the database schema.

I solved it by writing a post-checkout and a post-merge hook that can be used nicely with git. I store all my migrations in the form of SQL files in a separate directory and commit them alongside the changed PHP code. Each time I perform a `git checkout` or a `git merge`, git will automatically call the appropriate up- and down-migrations. See my implementation on GitHub.

As a special request (for those of you who don't want to follow the GitHub link), some more explanation:

Consider the following scenario. You have two branches:

  • master - which contains the website that's currently online
  • feature - which contains an unfinished new feature

For the new feature to work properly, it needs to change the database schema. The workflow is as follows:

  1. When, in your feature branch, you change code that needs a change to the database schema, you also commit two new SQL files to the migrations directory, say:

    • 20151120130200-extra-field-up.sql (containing all the SQL queries to migrate upwards)
    • 20151120130200-extra-field-down.sql (containing all the SQL queries to migrate downwards)
  2. When you now perform a checkout to master, the post-checkout git hook (sketched after this list) will:
    1. find all *-down.sql scripts in the commits from <new HEAD>..<old HEAD>
    2. execute those scripts with the local database
    3. find all *-up.sql scripts in the commits from <old HEAD>..<new HEAD>
    4. execute those scripts with the local database
  3. When you merge your feature branch into master, the post-merge hook will:
    1. find all *-up.sql scripts in the commits from master..feature
    2. execute those scripts with the local database
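
A minimal sketch of what such a post-checkout hook could look like (illustrative only, not the actual implementation from the GitHub link; the migrations/ directory and the local MySQL database name "app" are assumptions):

```
#!/bin/sh
# post-checkout: $1 = previous HEAD, $2 = new HEAD, $3 = 1 for branch checkouts
OLD_HEAD=$1; NEW_HEAD=$2
[ "$3" = "1" ] || exit 0

apply() { mysql app; }   # pipe a migration script into the local database

# Steps 1+2: down-migrations from the commits we are leaving, newest first.
# Those files are no longer in the working tree, so read them out of git.
git log --pretty=format: --name-only "$NEW_HEAD..$OLD_HEAD" \
  | grep '^migrations/.*-down\.sql$' \
  | while read -r f; do git show "$OLD_HEAD:$f" | apply; done

# Steps 3+4: up-migrations from the commits we are entering, oldest first.
git log --reverse --pretty=format: --name-only "$OLD_HEAD..$NEW_HEAD" \
  | grep '^migrations/.*-up\.sql$' \
  | while read -r f; do git show "$NEW_HEAD:$f" | apply; done
```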

Install

Simply copy the post-checkout and/or post-merge file to the .git/hooks directory of your own git repository. You can edit the config section of those files. See the files themselves for an explanation.

Usage

The naming of the migration SQL files is crucial. They should end with up.sql or down.sql. The rest of the name is completely up to you. However, if you have a single commit with multiple up-migrations and/or multiple down-migrations, the order in which they are performed depends on lexicographical order. Migration files that are in different commits will always be called in the same (for down-migrations, reverse) order as the commits.

It's not a requirement that you have both an up-migration and a down-migration, nor is it a requirement that up- and down-migrations are named similarly.

  • You should add more details about how these hooks work and how the migration scripts should be written. (Don't only link to the code and explanations, because the link might break in the future.) – try-catch-finally Nov 21 '15 at 10:52
1

An approach I'm thinking of testing in our current project is to create a 'migrations' branch to which all (and only) migrations are committed. Developers must merge from this branch into their current branch before creating a migration, so that their migration is always created on top of the latest migration. All branches merge from this branch, so every branch has a concept of a linear migration history. This gives every branch the ability to move back and forth between database versions. When switching to a branch that depends on a different version of the database, the developer applies whichever migration is appropriate.

The annoyance (besides the extra work and diligence of committing migrations to the special branch) is remembering which migration corresponds to a particular branch. One way of doing this is, instead of committing migrations directly onto the migrations branch, to commit the migration (and only the migration) onto the current branch and then cherry-pick that commit onto the migrations branch. Then you can just look at the last time the current branch was cherry-picked onto the migrations branch and know that that diff contains the necessary migration. Alternatively, the developer might create a migration just to see what changes would be necessary, and then infer which existing migration would be appropriate to use.
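
A hypothetical version of that cherry-pick flow (branch and file names invented for illustration):

```
# On the feature branch, commit the migration as its own commit.
git checkout feature/extra-field
git add migrations/20151120-extra-field.sql
git commit -m 'migration: add extra field'

# Copy just that commit (here the branch tip) onto the migrations branch.
git checkout migrations
git cherry-pick feature/extra-field

# Keep the feature branch on top of the latest migration history.
git checkout feature/extra-field
git merge migrations
```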

Sorry for the vague suggestion; if we do end up trying this approach I'll edit this suggestion with more concrete recommendations.

Carl G
0

This is something I've been working on lately. For me the problem hasn't been that the database schemas have diverged per se, but rather that git can't merge them together. Feature branches which touch the database schema are always scary.

The solution I've been thinking about is, instead of having linear migrations, to have migrations that depend on other migrations. You get a nice dependency graph of your migrations which is easy enough to linearize (topological sort). Just keep track of the applied migrations by name in your database, and execute, in the correct order, the updates that haven't been applied yet.

For example, addCustomerSalt depends on initialSchema, and separateAddress depends on person.
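
For what it's worth, coreutils already ships a tool for exactly that linearization step. A small sketch (the initialSchema -> person edge is my assumption, added to connect the graph):

```
# Each input line is "<dependency> <migration>"; tsort prints one
# valid linear order of the whole dependency graph.
tsort <<'EOF'
initialSchema addCustomerSalt
initialSchema person
person separateAddress
EOF
# One possible output:
#   initialSchema
#   addCustomerSalt
#   person
#   separateAddress
```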

The one problem this does not solve is when branch A depends on update Z, which was created in branch B; but maybe in that case you should rebase to a common ancestor?

Masse