I have a possibly peculiar data migration problem:
- I have an existing and populated SQLite3 database.
- I do receive a new schema for a (hopefully compatible) database.
Result should be a new database, built according to new schema, containing as much as possible of the old database content.
Given limitations in both SQLite3 ALTER statement and our workflow it is safe to assume:
- normal case will be new columns are added to end of table.
- added columns (fields) will either have a default or can be left NULL.
- rarely some table will be added.
- very rarely some table or column may be dropped.
- no table/column renaming will happen.
- no column reshuffling will happen.
NOTE: if the new schema is not compatible with the old one (i.e.: any of the above assumptions does not hold true) it's accepted to fail badly.
I tried this script (old database is data.sql3
and new schema is data.schema
):
mkdir tmp
cd tmp
#compute old DB schema
sqlite3 ../data.sql3 .schema >old_s
#purge new schema for any initialization...
grep -v ^INSERT ../data.schema >data.schema
#... create a dew, empty DB...
sqlite3 new.sql3 <data.schema
#... and compute a standard schema
#(this is done to avoid typing differences)
sqlite3 new.sql3 .schema >new_s
#iff the schemas are different
if ! diff -q old_s new_s
then
#save old DB
mv ../data.sql3 .
#dump contents
sqlite3 data.sql3 .dump >old_d
#expunge all statements needed to recreate DB/Tables
#new_d contains only INSERT statements
grep -v -f old_s old_d >new_d
#add old DB content to new DB
sqlite3 new.sql3 <new_d
#move new DB in place
mv new.sql3 ../data.sql3
fi
cd ..
This works to detect changes, but fails to repopulate the new database because .dump
does not contain column names and thus insertion fails (missing value).
What I'm looking for is some way to force sqlite3 DB .dump
to output INSERT
statements containing all field names (normally it relies on position) or, it that's not possible, some way to tell sqlite3 DB <new_d
to consider any undefined field as null
or default (without failing).
Any other way to achieve the same result (without requiring knowledge of what, exactly, has been modified) would be equally welcome.