
I have 4 CSV files, each with 500,000 rows. I am trying to import the CSV data into my Exasol database, but I get an error on the date column, and I also have a problem with an unwanted first column in the files.

Here is an example CSV file:

Unnamed: 0,time,lat,lon,nobs_cloud_day
0,2006-03-30,24.125,-119.375,22.0
1,2006-03-30,24.125,-119.125,25.0
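The key detail is that the first line of each file is a header row, so the first thing any importer reads is the literal text 'Unnamed: 0'. A quick way to see this (a Python sketch using a small inline sample rather than the real files):

```python
import csv
import io

# Inline sample mirroring the files from the question.
sample = """Unnamed: 0,time,lat,lon,nobs_cloud_day
0,2006-03-30,24.125,-119.375,22.0
1,2006-03-30,24.125,-119.125,25.0
"""

reader = csv.reader(io.StringIO(sample))
header = next(reader)
print(header[0])  # prints: Unnamed: 0 -- the value the cast chokes on
```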

The table I created to import the CSV into is:

CREATE TABLE cloud_coverage_CONUS (
index_cloud DECIMAL(10,0)
,"time" DATE -- PRIMARY KEY 
,lat DECIMAL(10,6)
,lon DECIMAL(10,6)
,nobs_cloud_day DECIMAL(3,1)
)

The command to import is:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv';

But I get this error:

SQL Error [42636]: java.sql.SQLException: ETL-3050: [Column=0 Row=0] [Transformation of value='Unnamed: 0' failed - invalid character value for cast; Value: 'Unnamed: 0'] (Session: 1750854753345597339) while executing '/* add path to the 4 csv files, that are in the cloud database folder*/ IMPORT INTO cloud_coverage_CONUS FROM CSV AT 'https://27.1.0.10:59205' FILE 'e12a96a6-a98f-4c0a-963a-e5dad7319fd5' ;'; 04509 java.sql.SQLException: java.net.SocketException: Connection reset by peer: socket write error

Alternatively, I use this table (without the first column):

CREATE TABLE cloud_coverage_CONUS (
"time" DATE -- PRIMARY KEY 
,lat DECIMAL(10,6)
,lon DECIMAL(10,6)
,nobs_cloud_day DECIMAL (3,1)
)

And this import command:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv'(2 FORMAT='YYYY-MM-DD', 3 .. 5);

But I still get this error:

SQL Error [42636]: java.sql.SQLException: ETL-3052: [Column=0 Row=0] [Transformation of value='time' failed - invalid value for YYYY format token; Value: 'time' Format: 'YYYY-MM-DD'] (Session: 1750854753345597339) while executing '/* add path to the 4 csv files, that are in the cloud database folder*/ IMPORT INTO cloud_coverage_CONUS FROM CSV AT 'https://27.1.0.10:60350' FILE '22c64219-cd10-4c35-9e81-018d20146222' (2 FORMAT='YYYY-MM-DD', 3 .. 5);'; 04509 java.sql.SQLException: java.net.SocketException: Connection reset by peer: socket write error

(I actually do want to ignore the first column in the files.)

How can I solve this issue?

Solution:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv' (2 .. 5) ROW SEPARATOR = 'CRLF' COLUMN SEPARATOR = ',' SKIP = 1;
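In plain terms, the two options do exactly what was needed: SKIP = 1 drops the header line, and (2 .. 5) keeps only CSV columns 2 through 5 (1-based), discarding the unwanted index column. Outside Exasol, the same logic would look like this (a Python sketch on an inline sample, not Exasol code):

```python
import csv
import io
from datetime import datetime

# Inline sample mirroring the files from the question.
sample = """Unnamed: 0,time,lat,lon,nobs_cloud_day
0,2006-03-30,24.125,-119.375,22.0
1,2006-03-30,24.125,-119.125,25.0
"""

rows = list(csv.reader(io.StringIO(sample)))
data = []
for fields in rows[1:]:                 # SKIP = 1: drop the header row
    time, lat, lon, nobs = fields[1:5]  # (2 .. 5): keep columns 2-5, 1-based
    data.append((datetime.strptime(time, '%Y-%m-%d').date(),
                 float(lat), float(lon), float(nobs)))
print(data[0])
```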

I did not realise that Exasol's import syntax is different from MySQL's.

Shahin

1 Answer


Looking at the first error message, a few things stand out. First, we see this:

[Column=0 Row=0]

This tells us the problem is with the very first value in the file. Next, the message tells us exactly which value was read:

Transformation of value='Unnamed: 0' failed

So it's failing to convert Unnamed: 0. You also provided the table definition, where we see the first column in the table is a decimal type.

This makes sense. Unnamed: 0 is not a decimal. For this to work, the CSV data MUST align with the data types for the columns in the table.

But we can also see that this looks like a header row. Assuming everything else matches, we can fix it by telling the database to skip that first row. I'm not familiar with Exasol, but based on the documentation I believe the correct code will look like this:

IMPORT INTO cloud_coverage_CONUS 
FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv'
    (2 FORMAT='YYYY-MM-DD', 3 .. 5)
ROW SEPARATOR = 'CRLF' 
COLUMN SEPARATOR = ','
SKIP = 1;
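If dropping the column at import time is not an option, another approach (not from the answer; a hedged pandas sketch on an inline sample, with the real file paths substituted in practice) is to rewrite the files without the index column before importing:

```python
import io

import pandas as pd

# Inline sample standing in for one of the real files.
raw = io.StringIO("Unnamed: 0,time,lat,lon,nobs_cloud_day\n"
                  "0,2006-03-30,24.125,-119.375,22.0\n")

df = pd.read_csv(raw)
df = df.drop(columns=df.columns[0])  # drop the unwanted index column
cleaned = df.to_csv(index=False)     # in practice: df.to_csv(path, index=False)
print(cleaned.splitlines()[0])       # prints: time,lat,lon,nobs_cloud_day
```

The rewritten files would then match the four-column table definition, and the IMPORT needs no column list at all (SKIP = 1 is still required for the header).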
Joel Coehoorn
  • thank you for replying to me. :) how can I ignore the first column I do not want it anyway – Shahin Nov 29 '22 at 17:56
  • @Shahin That depends on what kind of database you're using. The syntax for this is a little different in every database. Right now we only know the question is tagged "mysql", but neither the syntax nor the error message are from MySQL. Until we know what database you're using, we can't help more. In the future, do better tagging your question. If you'd done it right at first, you'd have a solution already. – Joel Coehoorn Nov 29 '22 at 17:58
  • exasol database. I did not know it is different. I apologise – Shahin Nov 29 '22 at 17:59
  • Thank you Joel because you made me realise that exasol is different from mysql I was able to find the solution. :) IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv' (2 .. 5) ROW SEPARATOR = 'CRLF' COLUMN SEPARATOR = ',' SKIP = 1; – Shahin Nov 29 '22 at 18:08