The best way I have found to do this is to first export the complete schema using the `pg_dump` tool with the `-s` flag (to dump schema only, not data), and then export the data separately.

To load your schemas into a fresh, empty database, use `pg_restore`. It reads an archive-format dump produced by `pg_dump` (e.g. with `-Fc`) and uses it to build the database; if you dumped plain SQL instead, load it with `psql`.
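For example, assuming the source and test databases are called `prod_db` and `test_db` (placeholder names, as is the dump file), the schema round-trip might look something like this:

```
# Dump schema only (-s) in custom archive format (-Fc) from the source database.
pg_dump -s -Fc -f schema.dump prod_db

# Recreate the schema in a fresh, empty test database.
createdb test_db
pg_restore -d test_db schema.dump
```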
When exporting the data, you'll need to classify each table (you can write a helper script to make this easier, or use Excel, etc.; see the sketch after this list):

- Tables that you want a subset of data from, based on some condition (e.g. a certain value in `person_id`)
- Tables you want to copy in their entirety (like dimension tables, such as `calendar` and `company_locations`)
- Tables that you don't need any data from
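As a starting point for that classification, something like the following will list every user table so you can paste the result into a spreadsheet. The database name and output file here are just placeholders:

```
# List all user tables as "schema,table" rows for classification in a spreadsheet.
psql -d prod_db -At -F ',' \
     -c "SELECT table_schema, table_name FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema NOT IN ('pg_catalog', 'information_schema') ORDER BY 1, 2" \
     -o tables_to_classify.csv
```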
For (1), you will need to write an appropriate `SELECT` query that returns the subset of data that you want to copy. Put those queries in a script and have each one write the result to a separate file named `<schema>.<table>`. Lastly, use the `psql` utility to load the data into the test database. `psql` has a special `\copy` command that makes this easy. It can be used from a terminal like this:

psql -c "\copy schema.table FROM '~/dump_data/schema.table' WITH DELIMITER ',' CSV"
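The export side of that script could be as simple as a loop over `\copy (SELECT ...) TO ...` calls run against the source database. A minimal sketch, in which the table names, `WHERE` conditions, output directory, and `prod_db` are all hypothetical examples:

```
# Export a subset of rows from each "(1)" table to ~/dump_data/<schema>.<table>.
DUMP_DIR=~/dump_data
mkdir -p "$DUMP_DIR"

export_subset() {
    local table="$1"   # e.g. schema.table
    local query="$2"   # SELECT returning the rows to copy
    psql -d prod_db -c "\copy ($query) TO '$DUMP_DIR/$table' WITH DELIMITER ',' CSV"
}

export_subset "public.person" "SELECT * FROM public.person WHERE person_id IN (1, 2, 3)"
export_subset "public.orders" "SELECT * FROM public.orders WHERE person_id IN (1, 2, 3)"
```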
Use `pg_dump` again to take care of all of those falling under (2), using the `-t <table name>` flag to dump only the named tables, and `-a` to dump only data (no schema). This could also be added to the script for (1) by adding an unqualified `SELECT *` for each table in (2) and loading the data the same way.
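Since `-t` can be repeated, the data for all of the (2) tables can be dumped in one pass and piped straight into the test database. Using the example table names from above and the placeholder database names `prod_db` and `test_db`:

```
# Dump only the data (-a) of the full-copy "(2)" tables and load it into the test database.
pg_dump -a -t calendar -t company_locations prod_db | psql -d test_db
```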
Tables falling under (3) were already handled by the initial export.