I have a Ruby on Rails application hosted on Heroku, using PostgreSQL as its database. Since the database is getting pretty large, I was wondering whether there's a way to download only a specific part of it from Heroku. For example, is it possible to download just one specific table, or only the rows where parent_id == x?
2 Answers
Using the `DATABASE_URL` config setting, you can point `pg_dump` at your database and use the `-t` switch to specify a certain table. For example, to export the table `my_table` into a file called `db.sql`:
pg_dump -t my_table `heroku config:get DATABASE_URL` > db.sql
If you need to limit the download to certain rows, then I don't think `pg_dump` will do the job on its own. You could create another table in your Heroku database to first define the subset of rows that you want to download, and then have `pg_dump` dump only that table. See this question for some ideas about how to do that: Export specific rows from a PostgreSQL table as INSERT SQL script
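A sketch of that two-step approach, assuming a hypothetical throwaway table `export_rows` and a filter on `parent_id` (table and column names here are placeholders, adjust for your own schema):

```shell
# Sketch only: materialise the subset on Heroku, dump it, then clean up.
DB_URL=$(heroku config:get DATABASE_URL)

# 1. Create a throwaway table holding just the rows you want.
psql "$DB_URL" -c "CREATE TABLE export_rows AS SELECT * FROM my_table WHERE parent_id = 42;"

# 2. Dump only that table.
pg_dump -t export_rows "$DB_URL" > export_rows.sql

# 3. Drop the throwaway table again.
psql "$DB_URL" -c "DROP TABLE export_rows;"
```
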
- So if I understand this correctly, I create specific export tables in the Heroku database and then download them with `pg_dump`? Is there a way I can write some kind of script to do all this? I basically want to be able to run a script that will go on Heroku, create the various export tables I need, and then download them. – Robert May 26 '15 at 06:51
- @Robert you can write a script to run on Heroku to export the data that you want. You could have the script run off the Heroku Scheduler if you want it to run regularly, and have it upload the dumps to S3. If you just want a one-off script, it's probably best to write something to run locally using your local pg client, as Craig suggests. – Steve May 26 '15 at 11:50
In addition to Steve's quite correct answer, you also have the option of connecting with `psql` to the `DATABASE_URL` and using `\copy`, e.g.:
$ psql "$(heroku config:get DATABASE_URL)"
mydb=> \copy mytable TO 'mytable.csv' WITH (FORMAT CSV, HEADER)
mydb=> \copy (SELECT col1, col2 FROM mytable2 WHERE ...) TO 'mytable2_partial.csv' WITH (FORMAT CSV, HEADER)
You can extract whole tables, or the output of arbitrary queries (including joins etc.). The table definition (DDL) is not exported this way, but can be dumped with `pg_dump --schema-only -t ...`.
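Since this will be run repeatedly, the `\copy` commands can be batched in a small script; a minimal sketch, where the table names in the list are placeholders for your own schema:

```shell
# Sketch: export several tables to CSV in one go via psql's \copy.
DB_URL=$(heroku config:get DATABASE_URL)

for table in mytable mytable2 mytable3; do
  # \\copy becomes \copy after shell quoting; each table lands in its own CSV.
  psql "$DB_URL" -c "\\copy $table TO '${table}.csv' WITH (FORMAT CSV, HEADER)"
done
```
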

- Is there a way I could put all of this into some kind of script (I will be doing this 'selective backup' often)? And also, since I will be getting multiple tables, is there a way to get them all in some kind of .dump file (instead of .csv) so it's easier to import? – Robert May 26 '15 at 06:58
- @Robert Sure. Any scripting language will do, if you want to invoke `psql` from the shell. Or you could use the Ruby `pg` gem's `copy` support to `COPY ... TO STDOUT` instead of using `psql` and `\copy`; there are existing answers on how to `COPY` directly from the `Pg` gem in Ruby. – Craig Ringer May 26 '15 at 07:01