91

DISCLAIMER: This question is similar to the stack overflow question here, but none of those answers work for my problem, as I will explain later.

I'm trying to copy a large table (~40M rows, 100+ columns) in postgres where a lot of the columns are indexed. Currently I use this bit of SQL:

CREATE TABLE <tablename>_copy (LIKE <tablename> INCLUDING ALL);
INSERT INTO <tablename>_copy SELECT * FROM <tablename>;

This method has two issues:

  1. It adds the indices before data ingest, so it will take much longer than creating the table without indices and then indexing after copying all of the data.
  2. This doesn't copy `SERIAL' style columns properly. Instead of setting up a new 'counter' on the the new table, it sets the default value of the column in the new table to the counter of the past table, meaning it won't increment as rows are added.

The table size makes indexing a real time issue. It also makes it infeasible to dump to a file to then re-ingest. I also don't have the advantage of a command line. I need to do this in SQL.

What I'd like to do is either straight make an exact copy with some miracle command, or if that's not possible, to copy the table with all contraints but without indices, and make sure they're the constraints 'in spirit' (aka a new counter for a SERIAL column). Then copy all of the data with a SELECT * and then copy over all of the indices.

Sources

  1. Stack Overflow question about database copying: This isn't what I'm asking for for three reasons

    • It uses the command line option pg_dump -t x2 | sed 's/x2/x3/g' | psql and in this setting I don't have access to the command line
    • It creates the indices pre data ingest, which is slow
    • It doesn't update the serial columns correctly as evidence by default nextval('x1_id_seq'::regclass)
  2. Method to reset the sequence value for a postgres table: This is great, but unfortunately it is very manual.

Community
  • 1
  • 1
Erik
  • 6,470
  • 5
  • 36
  • 37
  • 2
    Your question is likely a duplicate of http://stackoverflow.com/questions/198141/copy-a-table-including-indexes-in-postgres – goodside Jul 06 '11 at 20:45
  • I saw that question, there were no satisfactory answers that can actually do what I'm requesting, but this prompts me to make another edit to my post. – Erik Jul 06 '11 at 21:02
  • There are three major problems with the most up-voted solution on that page. One, they use command line functions in `pg_dump -t x2 | sed 's/x2/x3/g' | psql` which I don't have access too. Two, It creates the indices before adding the data which will be very slow! Three, the SERIAL's default parameter still references the first table `default nextval('x1_id_seq'::regclass).` These are three flaws I already pointed out in my question. You're telling me there's no solution to any of these? @peter – Erik Jul 07 '11 at 11:37
  • It's only a Small Matter of Programming. – Peter Eisentraut Jul 07 '11 at 12:42

7 Answers7

109

The create table as feature in PostgreSQL may now be the answer the OP was looking for.

https://www.postgresql.org/docs/9.5/static/sql-createtableas.html

create table my_table_copy as
  select * from my_table

This will create an identical table with the data.

Adding with no data will copy the schema without the data.

create table my_table_copy as
  select * from my_table
with no data

This will create the table with all the data, but without indexes and triggers etc.


create table my_table_copy (like my_table including all)

The create table like syntax will include all triggers, indexes, constraints, etc. But not include data.

Eric Leschinski
  • 146,994
  • 96
  • 417
  • 335
Phill
  • 18,398
  • 7
  • 62
  • 102
  • 7
    I posted this question so long ago that I have no easy means of verifying this easily. However, it doesn't seem like the "create table as" copies other objects associated with the table, like indices and sequences. – Erik Jun 09 '16 at 15:02
  • Ah you're right @Erik. What a shame. I'll leave my answer incase someone else finds it helpful. But added a note to say it wont copy the other information. Thanks. – Phill Jun 09 '16 at 16:19
  • 1
    Indeed it has been very helpful in indicating the limitations. Now we know what to look out for when using this method. Thanks! – frostymarvelous Apr 14 '17 at 09:51
  • `(like my_table including all)` may not satisfy the OP who didn't want indexes, but it is perfect for me trying to get a table with all constraints. – Noumenon May 17 '18 at 17:17
73

Well, you're gonna have to do some of this stuff by hand, unfortunately. But it can all be done from something like psql. The first command is simple enough:

select * into newtable from oldtable

This will create newtable with oldtable's data but not indexes. Then you've got to create the indexes and sequences etc on your own. You can get a list of all the indexes on a table with the command:

select indexdef from pg_indexes where tablename='oldtable';

Then run psql -E to access your db and use \d to look at the old table. You can then mangle these two queries to get the info on the sequences:

SELECT c.oid,
  n.nspname,
  c.relname
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(oldtable)$'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 2, 3;

SELECT a.attname,
  pg_catalog.format_type(a.atttypid, a.atttypmod),
  (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
   FROM pg_catalog.pg_attrdef d
   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
  a.attnotnull, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = '74359' AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Replace that 74359 above with the oid you get from the previous query.

Scott Marlowe
  • 8,490
  • 3
  • 23
  • 21
  • Note that if you then want the sequences to be dependent on the new parent table you'll have to use "alter sequence seqname owned by newtable.column;" – Scott Marlowe Jul 07 '11 at 16:22
16

The closest "miracle command" is something like

pg_dump -t tablename | sed -r 's/\btablename\b/tablename_copy/' | psql -f -

In particular, this takes care of creating the indexes after loading the table data.

But that doesn't reset the sequences; you will have to script that yourself.

Peter Eisentraut
  • 35,221
  • 12
  • 85
  • 90
  • 1
    Note the important warning below (by Tomek); if your data happen to include the tablename, this command will corrupt your data! For safety, I did a manual search and replace in emacs instead of sed. – CPBL Mar 22 '21 at 11:46
11

To copy a table completely, including both table structure and data, you use the following statement:

CREATE TABLE new_table AS 
TABLE existing_table;

To copy a table structure without data, you add the WITH NO DATA clause to the CREATE TABLE statement as follows:

CREATE TABLE new_table AS 
TABLE existing_table 
WITH NO DATA;

To copy a table with partial data from an existing table, you use the following statement:

CREATE TABLE new_table AS 
SELECT
*
FROM
    existing_table
WHERE
    condition;
K M Rakibul Islam
  • 33,760
  • 12
  • 89
  • 110
  • Can you link to a reference? I couldn't find any information about this syntax on postgresql.org. Some notable questions are a) are the indices preserved? b) what versions of postgres is this valid in? – Erik Jan 23 '19 at 18:31
  • 1
    CREATE TABLE new_table AS TABLE existing_table WITH NO DATA; This will not copy whole structure (like index, trigger, constraints etc) of a table. – Raju Ahmed Nov 11 '19 at 05:26
  • 1
    The sql **table command** is documented that the end of the **select command** page in https://www.postgresql.org/docs/13/sql-select.html. `table x` is shorthand for `select * from x`. **create table** and **create table as** are distinct documentation pages with differing syntax. See https://www.postgresql.org/docs/13/sql-createtable.html and https://www.postgresql.org/docs/13/sql-createtableas.html. The above queries only get the column type with/without data but are nonetheless are useful. – chipfall Jan 24 '22 at 18:21
3

WARNING:

All the answers which use pg_dump and any sort of regular expression to replace the name of the source table are really dangerous. What if your data contains the substring that you are trying to replace? You will end up changing your data!

I propose a two-pass solution:

  1. eliminate data lines from the dump using some data-specific regexp
  2. perform search-and-replace on the remaining lines

Here's an example written in Ruby:

ruby -pe 'gsub(/(members?)/, "\\1_copy_20130320") unless $_ =~ /^\d+\t.*(?:t|f)$/' < members-production-20130320.sql > copy_members_table-20130320.sql

In the above I am trying to copy "members" table into "members_copy_20130320". My data-specific regexp is /^\d+\t.*(?:t|f)$/

The above type of solution works for me. Caveat emptor...

edit:

OK, here's another way in pseudo-shell syntax for the regexp-averse people:

  1. pg_dump -s -t mytable mydb > mytable_schema.sql
  2. search-and-replace table name in mytable_schema.sql > mytable_copy_schema.sql
  3. psql -f mytable_copy_schema.sql mydb

  4. pg_dump -a -t mytable mydb > mytable_data.sql

  5. replace "mytable" in the few SQL statement preceding the data section
  6. psql -f mytable_data.sql mydb
Tomek
  • 1,170
  • 11
  • 10
1
create table newTableName (like   oldTableName including indexes); 
insert into newTableName  select * from oldTableName

This worked for me 9.3

Akshat Zala
  • 710
  • 1
  • 8
  • 23
0

Apparently you want to "rebuild" a table. If you only want to rebuild a table, not copy it, then you should use CLUSTER instead.

SELECT count(*) FROM table; -- make a seq scan to make sure the table is at least
                            -- decently cached
CLUSTER someindex ON table;

You get to choose the index, try to pick one that suits your queries. You can always use the primary key if no other index is suitable.

If your table is too large to be cached, CLUSTER can be slow though.

bobflux
  • 11,123
  • 3
  • 27
  • 27
  • I do actually want to copy, I removed the extra code that wasn't really relavent to the question. From what I can tell, CLUSTER just reorders the rows based on the index, which is not really what I'm looking for. Sorry for the misinformation. – Erik Jul 06 '11 at 20:37