32

I have been running a big Rails application for over 2 years and, day by day, my ActiveRecord migration folder has grown to over 150 files.

There are very old models, no longer present in the application, that are still referenced in the migrations. I was thinking of removing them.

What do you think? Do you usually purge old migrations from your codebase?

the Tin Man
Simone Carletti

10 Answers

18

In The Rails 4 Way, page 177, Sebastian says:

A little-known fact is that you can remove old migration files (while still keeping newer ones) to keep the db/migrate folder to a manageable size. You can move the older migrations to a db/archived_migrations folder or something like that. Once you do trim the size of your migrations folder, use the rake db:reset task to (re-)create your database from db/schema.rb and load the seeds into your current environment.
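The archiving step described in the quote could be scripted roughly as follows. This is only a minimal sketch: the method name `archive_migrations` and the `cutoff_version` parameter are made up for illustration, and it assumes migration files follow the usual `<timestamp>_<name>.rb` naming.

```ruby
require "fileutils"

# Hypothetical helper (not part of Rails): move every migration whose
# timestamp prefix is older than `cutoff_version` from `migrate_dir`
# into `archive_dir`. Returns the names of the moved files, sorted.
def archive_migrations(migrate_dir, archive_dir, cutoff_version)
  FileUtils.mkdir_p(archive_dir)

  old = Dir.children(migrate_dir).sort.select do |name|
    # Only files named like 20141010044951_create_users.rb are considered.
    version = name[/\A(\d+)_.*\.rb\z/, 1]
    version && version.to_i < cutoff_version.to_i
  end

  old.each do |name|
    FileUtils.mv(File.join(migrate_dir, name), File.join(archive_dir, name))
  end
  old
end
```

For example, `archive_migrations("db/migrate", "db/archived_migrations", "20130101000000")` would move everything created before 2013 and leave newer migrations in place.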

user12121234
14

Once I hit a major site release, I'll roll the migrations into one and start fresh. I feel dirty once the migration version numbers get up around 75.

  • 7
    How do you roll them into one? Manually? – Alison R. Nov 22 '10 at 19:28
  • 4
    @AlisonR. - not sure if this was true when you asked, but in Rails 3 you can just copy `schema.rb`. – Nathan Long Dec 11 '12 at 18:12
  • 1
    Why roll them into one at all? `schema.rb` should be the canonical source of your database anyway. You can run `rake db:schema:load` to get the latest schema, and then `rake db:migrate` to get the latest migrations. – lobati Jun 11 '14 at 19:19
5

I occasionally purge all migrations that have already been applied in production, and I see at least 2 reasons for this:

  • More manageable folder: it is easier to spot a new migration.
  • Cleaner text search results: a global text search within the project does not return tons of useless matches from some 3-year-old migration that added or removed a column which no longer exists anyway.
Artur INTECH
4

Personally I like to keep things tidy in the migration files. I think once you have pushed all your changes into prod you should really look at archiving the migrations. The only difficulty I have faced with this is that Travis runs db:migrate when it builds, so these are the steps I have used:

  1. Move historic migrations from /db/migrate/ to /db/archive/release-x.y/

  2. Manually create a new migration file that reuses the version number of the last migration run in the /db/archive/release-x.y directory, with a description such as from_previous_version. Because that version number is already recorded as applied, the migration won't run on your prod machine and mess things up.

  3. Copy the contents of schema.rb from inside the ActiveRecord::Schema.define(version: 20141010044951) do block and paste them into the change method of your from_previous_version migration.

  4. Check all that in and Robert should be your parent's brother.

The only other consideration would be if your migrations create any data (my test scenarios contain all their own data, so I don't have this issue).
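Step 3 can also be done mechanically. The following is a rough sketch, not a tested tool: `baseline_migration_source` is a made-up helper name, and it assumes schema.rb has the usual shape of a single `ActiveRecord::Schema.define(...) do ... end` block. On Rails 5 and later the superclass needs a version suffix such as `ActiveRecord::Migration[5.0]`, so it is left as a parameter.

```ruby
# Hypothetical helper: build the source of one "baseline" migration from
# the text of schema.rb by lifting the body of the Schema.define block
# into a `change` method.
def baseline_migration_source(schema_source,
                              class_name: "FromPreviousVersion",
                              superclass: "ActiveRecord::Migration")
  lines = schema_source.lines

  # Opening line, e.g. "ActiveRecord::Schema.define(version: 20141010044951) do"
  start = lines.index do |l|
    l.match?(/\AActiveRecord::Schema(\[[\d.]+\])?\.define\(.*\) do\s*\z/)
  end
  # The unindented `end` that closes the block.
  finish = lines.rindex { |l| l.match?(/\Aend\s*\z/) }

  unless start && finish && finish > start
    raise ArgumentError, "no ActiveRecord::Schema.define block found"
  end

  # Indent the body by two more spaces so it sits inside `def change`.
  body = lines[(start + 1)...finish]
         .map { |l| l.strip.empty? ? "\n" : "  #{l}" }
         .join

  "class #{class_name} < #{superclass}\n  def change\n#{body}  end\nend\n"
end
```

The result would be written to a file named after the reused version number, for example `20141010044951_from_previous_version.rb`. It is worth reading the generated file before committing it, since a schema dump can carry options such as `force:` on `create_table` that you may not want in a migration.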

FuzzyJulz
4

They are relatively small, so I would choose to keep them, just for the record.

You should write your migrations without referencing models or other parts of the application, because otherwise they'll come back to haunt you ;)

Check out these guidelines:

http://guides.rubyonrails.org/migrations.html#using-models-in-your-migrations

samuil
  • 1
    I tried writing migrations without model references for a while and it turned out to be a pretty big headache, especially if I was trying to add new database constraints. I'd have to write a rake task to first clean up the data, then later push a migration to add the constraint, because you can't run a rake task when there are migrations that haven't been run. So much easier to just put it all in the migration. Added bonus is it's transactional implicitly, so failures roll back. – lobati Jun 11 '14 at 19:38
  • @lobati: I don't get the connection between using your models in your migrations and new database constraints. I use real FKs, CHECKs, triggers, ... all over the place and manage them using SQL in migrations, no models needed in my migrations. – mu is too short Jun 11 '14 at 20:22
  • @muistooshort I find it's nice to be able to rely on the model level validations and query conveniences when adding constraints or moving data. For example, when we are adding a FK constraint, we sometimes find that the associated record is missing, so we might want to conditionally reconstitute it somehow or just destroy it. Much of this can probably be handled with raw SQL, but I also find that it's nice to have the extra layer of abstraction in case I fat-finger a query. – lobati Jun 11 '14 at 23:04
  • @lobati: I tend to view the database as a wholly separate application whose API is SQL. That makes the database responsible for its consistency and all the model validations are a backup (for consistency and integrity) that make communicating errors to the outside world a bit easier. That leads to some duplication of effort but that's actually a feature. As far as fat-fingering goes, that's why you put your constraints inside the database. Broken code is temporary, broken data is forever. We're probably talking around the same thing in any case. – mu is too short Jun 11 '14 at 23:21
  • @muistooshort I think I agree with all of that. Unfortunately, I haven't always been as wise on these things, and often I end up bringing my old work up to date with my changing philosophy, or the pre-existing codebases and databases of others. Side note: it might be handy if Rails were able to infer validations from your database constraints. – lobati Jun 11 '14 at 23:29
3

Why? Unless there is some kind of problem with disk space, I don't see a good reason for deleting them. I guess if you are absolutely certain that you are never going to roll back anything ever again, then you can. However, saving a few KB of disk space doesn't seem worth it. Also, if you just want to delete the migrations that refer to old models, you have to look through them all by hand to make sure you don't delete anything that is still used in your app. Lots of effort for little gain, to me.

Jamison Dance
  • The main reason is because there's no model definition in the /models folder that points to the table, since the model has been completely removed. – Simone Carletti Nov 22 '10 at 18:20
  • It may seem inefficient to have migrations that create and then remove tables, but unless doing a full database build is a time-critical process, there's no reason to mess with them. They like hanging around. – Jeremy Nov 22 '10 at 19:12
  • 2
    It does make a lot of sense if you really have a lot of migrations. We have ~1000 already and it now takes a long time to run "rake db:migrate", even if there is only one migration to run, because (I think) Rails loads (parses) all of them into memory first. We have never purged our migrations, but now that I'm here I'll definitely consider it. – Alex Kovshovik Nov 22 '10 at 22:53
  • 1
    One reason: I've just removed Devise from an app (moving to a single-sign-on). Its original migration used things like `t.recoverable`, which doesn't exist without the gem, so I can no longer run that migration at all. At the very least I'd have to comment it out. – Nathan Long Dec 11 '12 at 18:16
3

See http://edgeguides.rubyonrails.org/active_record_migrations.html#schema-dumping-and-you

Migrations are not a representation of the database: either structure.sql or schema.rb is. Migrations are also not a good place for setting/initializing data. db/seeds or a rake task are better for that kind of task.

So what are migrations? In my opinion they are instructions for how to change the database schema - either forwards or backwards (via a rollback). Unless there is a problem, they should be run only in the following cases:

  1. On my local development machine as a way to test the migration itself and write the schema/structure file.
  2. On colleague developer machines as a way to change the schema without dropping the database.
  3. On production machines as a way to change the schema without dropping the database.

Once run, they should be irrelevant. Of course mistakes happen, so you definitely want to keep migrations around for a few months in case you need to roll back.

CI environments do not ever need to run migrations. It slows down your CI environment and is error prone (just like the Rails guide says). Since your test environments only have ephemeral data, you should instead be using rake db:setup, which will load from the schema.rb/structure.sql and completely ignore your migration files.

If you're using source control, there is no benefit in keeping old migrations around; they are part of the source history. It might make sense to put them in an archive folder if that's your cup of coffee.

With that all being said, I strongly think it makes sense to purge old migrations, for the following reasons:

  • They could contain code that is so old it will no longer run (like if you removed a model). This creates a trap for other developers who want to run rake db:migrate.
  • They will slow down grep-like tasks and are irrelevant past a certain age.

Why are they irrelevant? Once more, for two reasons: the history is stored in your source control, and the actual database structure is stored in structure.sql/schema.rb. My rule of thumb is that migrations older than about 12 months are completely irrelevant. I delete them. If there were some reason why I wanted to roll back a migration older than that, I'm confident that the database would have changed enough in that time to warrant writing a new migration to perform that task.

So how do you get rid of the migrations? These are the steps I follow:

  1. Delete the migration files
  2. Write a rake task to delete their corresponding rows in the schema_migrations table of your database.
  3. Run rake db:migrate to regenerate structure.sql/schema.rb.
  4. Validate that the only thing changed in structure.sql/schema.rb is removed lines corresponding to each of the migrations you deleted.
  5. Deploy, then run the rake task from step 2 on production.
  6. Make sure other developers run the rake task from step 2 on their machines.

The second item is necessary to keep schema/structure accurate, which, again, is the only thing that actually matters here.
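The core of the rake task from step 2 might look something like this. It is a sketch under a few assumptions: the helper names `stale_versions` and `delete_stale_sql` are invented for illustration, migration files are named `<timestamp>_<name>.rb`, and the table is the default `schema_migrations` with its `version` column.

```ruby
# Hypothetical helper: versions recorded as applied that no longer have
# a migration file in `migrate_dir`.
def stale_versions(applied_versions, migrate_dir)
  on_disk = Dir.children(migrate_dir)
               .map { |name| name[/\A(\d+)_.*\.rb\z/, 1] }
               .compact
  applied_versions.map(&:to_s) - on_disk
end

# Hypothetical helper: SQL that deletes the given rows, or nil when there
# is nothing to delete. Versions must be digit-only strings.
def delete_stale_sql(versions)
  return nil if versions.empty?

  bad = versions.reject { |v| v.match?(/\A\d+\z/) }
  raise ArgumentError, "non-numeric version: #{bad.first}" unless bad.empty?

  list = versions.map { |v| "'#{v}'" }.join(", ")
  "DELETE FROM schema_migrations WHERE version IN (#{list})"
end
```

Inside the actual rake task, the applied versions could be read with `ActiveRecord::Base.connection.select_values("SELECT version FROM schema_migrations")` and the generated statement run with `ActiveRecord::Base.connection.execute`. Printing the stale list before deleting anything is a cheap safety check.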

ledhed2222
1

It's fine to remove old migrations once you're comfortable they won't be needed. The purpose of migrations is to have a tool for making and rolling back database changes. Once the changes have been made and in production for a couple of months, you're unlikely to need them again. I find that after a while they're just cruft that clutters up your repo, searches, and file navigation.

Some people will run the migrations from scratch to reload their dev database, but that's not really what they're intended for. You can use rake db:schema:load to load the latest schema, and rake db:seed to populate it with seed data. rake db:reset drops the database and then does both for you. If you've got database extensions that can't be dumped to schema.rb, then you can use the sql schema format for ActiveRecord and run rake db:structure:load instead.

lobati
0

I agree, no value in 100+ migrations, the history is a mess, there is no easy way of tracking history on a single table and it adds clutter to your file finding. Simply Muda IMO :)

Here's a 3-step guide to squashing all migrations into a schema identical to production:

Step 1: schema from production

# launch rails console in production
stream = StringIO.new
ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, stream); nil
stream.rewind
puts stream.read

The output can be copy-pasted into a migration, minus the obvious header.

Step 2: making the migrations without it being run in production

This is important. Use the last migration and change its name and content. ActiveRecord stores the datetime number in its schema_migrations table so it knows what it has and has not run. Reuse the last one and it'll think it has already run.

Example: rename 20161202212203_this_is_the_last_migration.rb -> 20161202212203_schema_of_20161203.rb

And put the schema there.

Step 3: verify and troubleshoot

Locally, rake db:drop, rake db:create, rake db:migrate

Verify that schema is identical. One issue we encountered was datetime "now()" in schema, here's the best solution I could find for that: https://stackoverflow.com/a/40840867/252799
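For the "verify" part, a small comparison helper can point at the first place where the locally generated schema.rb differs from the production dump. This is only a sketch with made-up method names (`normalize_schema`, `first_schema_difference`). It ignores comments and blank lines, and compares everything else line by line.

```ruby
# Hypothetical helper: drop comment lines, blank lines, and trailing
# whitespace so that two schema dumps can be compared line by line.
def normalize_schema(source)
  source.lines
        .map(&:rstrip)
        .reject { |l| l.empty? || l.lstrip.start_with?("#") }
end

# Hypothetical helper: returns nil when both dumps match after
# normalization, otherwise a hash describing the first mismatch
# (zero-based index into the normalized lines).
def first_schema_difference(expected, actual)
  a = normalize_schema(expected)
  b = normalize_schema(actual)
  index = (0...[a.size, b.size].max).find { |i| a[i] != b[i] }
  index && { line: index, expected: a[index], actual: b[index] }
end
```

Calling `first_schema_difference(production_dump, File.read("db/schema.rb"))` after the local rebuild would return nil if the two match, where `production_dump` stands for the text captured in Step 1.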

oma
0

Yes. If you have completely removed a model, and its table from the database as well, then it is worth cleaning up the migrations that refer to it. If the model reference in a migration does not depend on anything else, you can delete it. That migration will never run again on a database where it has already run, but if you leave the reference in place it will cause a problem whenever you migrate a fresh database.

So it is better to remove that reference from the migration, and to refactor/minimize the migrations down to one or two files before a big release to the live database.

Nimesh Nikum