I have tens of csv files with over a hundred columns in each. I need to upload these files to postgres tables so to process them and transfer the data to relational tables. I don't want to process each file manually to extract the column names as this might be a repetitive process. Neither the pgAdmin import tool nor the COPY function processes the first row to create columns of the table. So what would be the best approach to handle this issue?
Asked
Active
Viewed 2,553 times
2
-
did you check http://stackoverflow.com/questions/21018256/can-i-automatically-create-a-table-in-postgresql-from-a-csv-file-with-headers ? – Andrew Wolfe Jun 13 '16 at 15:24
-
@AndrewWolfe I don't remember as it has been a year now. But I am glad to see they are referring to one of my solutions in the 2nd answer given (i used to use mmatt name :) – mehmet Jun 13 '16 at 18:14
1 Answers
0
I don't know if this is a sound method or not (due to heavy use of dynamic sql and the methods used to extract column names), here is what has worked for me:
create or replace function data.csv_to_table (in target_table text, in csv_path text, in col_count integer)
returns void as $$
declare
iter integer; --dummy integer to iterate with
col text; --dummy variable to iterate with
col_first text; --first column label, e.g., top left corner on a csv file or spreadsheet
begin
set schema 'data';
drop table if exists temp_table;
create table temp_table ();
-- add just enough number of columns
for iter in 1..col_count
loop
execute 'alter table temp_table add column col_' || iter || ' varchar;';
end loop;
-- copy the data from csv file
execute 'copy temp_table from ''' || csv_path || ''' with delimiter '',''';
iter := 1;
col_first := (select col_1 from temp_table limit 1);
-- update the column names based on the first row which has the column names
for col in execute 'select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = ''' || col_first || ''''
loop
execute 'alter table temp_table rename column col_' || iter || ' to ' || col;
iter := iter + 1;
end loop;
-- delete the columns row
execute 'delete from temp_table where ' || col_first || ' = ''' || col_first || '''';
-- change the temp table name to the parameter given if not blank
if length(target_table) > 0 then
execute 'drop table if exists ' || target_table;
execute 'alter table temp_table rename to ' || target_table;
end if;
end;
$$ language plpgsql;

mehmet
- 7,720
- 5
- 42
- 48
-
i tried adding quote_ident() to the line execute 'delete from temp_table.. but did not work. This is a table with no PK, that might be why.. – mehmet May 11 '15 at 16:56