I have millions of rows of GPS data to be copied into a PostgreSQL table.
20110228,235958,QS,15287,121.533272,31.180235,42.3,163,0,1,2011-03-01 00:00:00;
20110228,235958,QS,14715,121.430813,31.199277,36.2,54,1,1,2011-03-01 00:00:00;
20110228,235958,QS,15728,121.400303,31.259683,33,222,0,1,2011-03-01 00:00:00;
20110228,235958,QS,17857,121.525747,31.20636,6.1,152,0,1,2011-03-01 00:00:00;
20110228,235959,QS,10308,121.376895,31.116863,0,325,0,1,2011-03-01 00:00:00;
...
I create a table as follows first
EXECUTE 'CREATE TABLE ' ||taxiDataTableName|| ' (
serial_id SERIAL NOT NULL PRIMARY KEY,
rec_date INT, -- record date
rec_time INT, -- record time
company CHAR(5), -- taxi company name
taxi_id SMALLINT, -- taxi ID
long double precision, -- longitude, E
lat double precision, -- latitude, N
speed REAL, -- taxi speed, km/h
bearing SMALLINT, -- direction of speed
status SMALLINT, -- 0: empty, 1:loaded
infoNo SMALLINT, -- infoNo before time stamp information
info CHAR(25) -- time stamp information
)';
I then run the following command to import data
EXECUTE 'COPY '||taxiDataTableName||' (rec_date,rec_time,company,taxi_id,long,lat,speed,bearing,status,infoNo,info)
FROM '''||datadir||datafilename||''' (DELIMITER('',''))';
Everything works fine until the row 730745
ERROR: value "00155301001555" is out of range for type integer
SQL state: 22003
Context: COPY _20110301_rawdata, line 730745, column rec_time: "00155301001555"
SQL statement "COPY _20110301_rawdata (rec_date,rec_time,company,taxi_id,long,lat,speed,bearing,status,infoNo,info)
FROM 'G:/PostgreSQL_Data_Repo/Taxi_GPS_Data/201103/20110301.txt' (DELIMITER(','))"
PL/pgSQL function load_data_to_table(integer[]) line 68 at EXECUTE
Having checked the data, the row 730745 is
20110301,00155301001555,QS,14834,121.684833,31.040167,0,177,0,1,2011-03-01 00:15:55;
It's obvious that the record time 00155301001555
(2nd column) is erroneous. I think I could change the type of rec_time to BIGINT
and import data regardless of the erroneous data and filter outliers, but this may make the database file size larger, so can PostgreSQL set the out-of-range value (violating the specified data type in table creation) to null?