
I'm trying to create a new table with many columns whose names are generated from an index.

CREATE TABLE rflux (pk SERIAL PRIMARY KEY NOT NULL);

Now I want to add new columns col0, col1, col2, col3, ... up to colN, all of type FLOAT.

I know I can do something like

ALTER TABLE rflux ADD COLUMN col0 FLOAT, ADD COLUMN col1 FLOAT, ... ;

but I don't want to type everything out, since I need to create ~4500 columns. I'm trying to implement this with loops, but I can't quite get it working. Does anyone have any ideas? I've tried creating a function to do this...

create function addColumns()
returns void
as $$
begin
    for i in 1..10 loop
        alter table rflux add column tmp float;
        alter table rflux rename tmp to col||i;
    end loop;
    return;
end;
$$
language plpgsql;

then do select * from addColumns();

but I get errors when renaming the column to col||i, or even just when I try i. I'm not even sure this is the best way to do this. How do I add multiple columns whose names increment with a counter?

Edit: I know I can't do this with 4500 columns, but what's the solution to this problem if I wanted to do it for, say, 10 columns?

Thanks.

havok2063
  • [Maximum Columns per Table 250 - 1600 depending on column types](http://www.postgresql.org/about/) – Mike Sherrill 'Cat Recall' Dec 04 '14 at 16:31
  • 4500 columns?? I think you might want to consider redesigning your schema rather than shoehorning a bad design into your DB. To do what you want, though, you'll want to use "dynamic SQL" in a stored procedure (a sketch follows these comments)... but you probably shouldn't. – JNevill Dec 04 '14 at 16:41
  • What I actually want is to store arrays of 4500 elements into the database somehow. I can store each array as one float[] column, or with each element as a separate column. I'm concerned about access and slicing of the array via SQLAlchemy. As far as I understand, Postgres stores arrays as string elements {1,2,3}. When this is read in by SQLAlchemy, it parses this string and converts it to a list of elements. I'll have millions of rows of data, and need to filter on subsets of rows where certain elements of the arrays satisfy a given condition. So I thought the 2nd way would be better. – havok2063 Dec 04 '14 at 18:34
  • If there's a way to store arrays in a single column, but have a fast subset selection mechanism using SQLAlchemy that works for millions of rows, for multiple simultaneous users, then I'm all ears. I'm just getting started in this, so don't know what's all possible. – havok2063 Dec 04 '14 at 18:38
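
For reference, a minimal sketch of the dynamic-SQL approach the comments describe, applied to the 10-column case from the question (a sketch only; format() requires PostgreSQL 9.1+, and the table/column names come from the question):

DO $$
BEGIN
    FOR i IN 0..9 LOOP
        -- DDL cannot take a variable column name directly, so the
        -- statement is built as a string and run through EXECUTE
        EXECUTE format('ALTER TABLE rflux ADD COLUMN col%s float', i);
    END LOOP;
END;
$$;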

3 Answers


In case it helps:

-- VERSION : POSTGRESQL 9.3

-- FICTIVE TABLE #1

CREATE TABLE table_1 ("YEAR" int, "CODE_SP" text, "TOTAL" int);

INSERT INTO table_1 VALUES
(2035, 'TRUC', 2),
(2035, 'MUCH', 4),
(2036, 'FLIC', 7),
(2036, 'LORD', 2),
(2036, 'TRUC', 8),
(2037, 'MUCH', 2),
(2037, 'FLIC', 2),
(2037, 'FLAC', 5),
(2037, 'MORD', 9),
(2038, 'LOOP', 3),
(2038, 'MUCH', 3);

SELECT * FROM table_1;

-- FICTIVE TABLE #2

CREATE TABLE table_2 ("YEAR" int);

INSERT INTO table_2("YEAR")
SELECT serial
FROM generate_series(2035,2038,1) AS serial;

SELECT * FROM table_2;

-- LOOP FOR ADDING COLUMNS ON TABLE #2 

DO
$do$
    DECLARE colnames TEXT;
BEGIN
FOR colnames IN 
    SELECT "CODE_SP"
    FROM table_1
    GROUP BY "CODE_SP"
    ORDER BY "CODE_SP"
LOOP
    EXECUTE 'ALTER TABLE table_2 ADD COLUMN ' || quote_ident(colnames) || ' INT DEFAULT NULL;'; /* careful: in quoted text, the spaces are important */
END LOOP;
END
$do$;

-- LOOP FOR COMPLETING CELLS ON TABLE #2

DO
$do$
    DECLARE id_value TEXT;
BEGIN
FOR id_value IN
    SELECT "CODE_SP"
    FROM table_1
    GROUP BY "CODE_SP"
    ORDER BY "CODE_SP"
LOOP
    EXECUTE 'UPDATE table_2 SET ' || quote_ident(id_value) || ' = table_1."TOTAL" FROM table_1 WHERE table_1."CODE_SP" = ' || quote_literal(id_value) || ' AND table_1."YEAR" = table_2."YEAR";'; /* careful: in quoted text, the spaces are important */
END LOOP;
END
$do$;
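
After both blocks run, table_2 has one INT column per distinct CODE_SP value (FLAC, FLIC, LOOP, LORD, MORD, MUCH, TRUC), each filled with the matching TOTAL for its YEAR; in effect this is a hand-rolled pivot. The key points carry over to the original question: DDL with a variable identifier has to be built as a string and run with EXECUTE, and quote_ident() keeps generated names safe.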
k.m

An additional solution that I did not find elsewhere, but came up with myself.

do $$
declare
    colnames text[];
    -- each element is a {table_name, column_name} pair
    target_array text[] := array[
                               ['table1','product_id'],
                               ['table2','product_id'],
                               ['table3','product_id']
                           ];
begin
    foreach colnames slice 1 in array target_array
    loop
        -- format() with %I quotes each identifier safely
        execute format('alter table %I add column %I int default null',
                       colnames[1], colnames[2]);
        raise notice 'table: %, column: %', colnames[1], colnames[2];
    end loop;
end; $$;
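
FOREACH ... SLICE 1 steps through target_array one sub-array at a time, so each iteration receives a complete {table, column} pair; this makes it easy to add the same column to several tables (or several columns to one table) in a single pass.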

Your design would likely fit better into an array, hstore, or json data type. Adding 4500 columns is a nightmare waiting to happen.

Joe Love
  • Yeah, that's what it sounds like, especially given the column limits in PostgreSQL. Do you have an idea how this affects the efficiency of filtering on subsets of these arrays for millions of rows in SQLAlchemy, or even in Postgres itself? – havok2063 Dec 04 '14 at 19:23
  • If you're using Postgres 9.3, json is pretty good and you can even index the individual data elements within the json object. If you're using a version below 9.3, I'd go with hstore; those subfields can be indexed as well. With an array, you lose the ability to name your fields and order becomes extremely important, so personally I'd avoid that scenario. – Joe Love Dec 04 '14 at 21:01
  • Hmmm. Ok. I'm using postgres 9.3 so I'll look into using JSON and hstore as well. I don't know much about them compared to arrays but if they let me easily select out subsets of elements in the columns in queries then I'm all for it. Thanks. – havok2063 Dec 04 '14 at 23:50
  • JSON data is easy to query: select json_column->>'subfield' from mytable (a fuller sketch follows these comments). – Joe Love Dec 05 '14 at 01:09
  • Well most of my querying will be doing through SQLalchemy so I'll have to see how it works from there, but I imagine it's similar. – havok2063 Dec 05 '14 at 02:46
  • That's the trouble with ORMs: advanced DB features are rarely as easy to use. – Joe Love Dec 05 '14 at 03:02
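
To make the comment thread concrete, here is a minimal sketch of the single-JSON-column design being discussed (PostgreSQL 9.3 assumed; the table and key names are illustrative, not from the thread):

-- hypothetical single-column layout: one json document per row
CREATE TABLE rflux_json (
    pk       SERIAL PRIMARY KEY,
    spectrum json  -- e.g. {"col0": 1.2, "col1": 3.4, ...}
);

-- individual elements can be extracted and filtered directly...
SELECT pk
FROM rflux_json
WHERE (spectrum->>'col0')::float > 2.5;

-- ...and an expression index can speed that filter up
CREATE INDEX rflux_json_col0_idx
    ON rflux_json (((spectrum->>'col0')::float));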