
I have a tab-delimited data file with many missing values, and I need to import it into a table in MariaDB (10.4.5).

I used this command:

LOAD DATA INFILE 'c:/path to file/file.txt' INTO TABLE table_name FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;

But I get this error:

SQL Error (1366): Incorrect double value: '' for column db_name.table_name.col_name1 at row 10

When I examine the text data file, col_name1 at row 10 is a missing value, i.e. there is nothing between the two tab delimiters.

I have spent hours trying to solve this issue and would appreciate any help: is there a way to import the data, including the missing values (empty strings), into the MariaDB table?

Do I need to pre-process the text file before using LOAD DATA INFILE? And if so, what would be the best way to pre-process?
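For completeness, if you did want to pre-process the file first, one common approach is to replace each empty field with `\N`, which LOAD DATA INFILE imports as NULL (assuming the column is nullable). A minimal Python sketch of that idea, with the core logic as a function (the file names in the original question would be the real input/output):

```python
def fill_missing(line, marker=r"\N"):
    """Replace empty fields in one tab-delimited line with `marker`."""
    fields = line.rstrip("\n").split("\t")
    return "\t".join(f if f != "" else marker for f in fields)

# Example line with a missing value between two tabs:
print(fill_missing("1.5\t\t2.7"))  # the empty middle field becomes \N
```

Applied line by line to the source file (skipping or copying the header row as needed), this produces a file that loads without the "Incorrect double value" error.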

Paul
  • Does this answer your question? [MySQL load NULL values from CSV data](https://stackoverflow.com/questions/2675323/mysql-load-null-values-from-csv-data) – Georg Richter Jan 21 '20 at 06:06
  • I reviewed this - the difference is that in my case the import stopped completely rather than replacing the empty string with zero. It looks, though, as if the solution is similar if not identical – Paul Jan 21 '20 at 14:38

1 Answer


Do I need to pre-process the text file before using LOAD DATA INFILE? And if so, what would be the best way to pre-process?

You do not need to pre-process the file; you can handle this during the import itself. Something like:

LOAD DATA INFILE 'c:/path to file/file.txt' 
INTO TABLE table_name 
FIELDS TERMINATED BY '\t' 
LINES TERMINATED BY '\n' 
IGNORE 1 ROWS
-- list fields whose values are set directly,
-- and intermediate variables for values that must be processed,
-- positioned according to the source file's column order
(field1, field2, @variable3, field4, ...)
-- process the values in the variables and set fields values
SET field3 = CASE WHEN @variable3 = '' THEN 0 ELSE @variable3 END;

(field1, field2, @variable3, field4, ...) is the list of destinations for the data fields parsed from each line of the source file.

That is, the first value parsed from the line currently being processed is assigned directly to the field field1 of the destination table; likewise the second value and field2.

The third parsed value is assigned to the user-defined local variable @variable3.

The fourth parsed value is again assigned directly to a table field, and so on if more columns are present.

After the whole line has been parsed according to this specification, the next processing directive is executed: SET field3 = CASE WHEN @variable3 = '' THEN 0 ELSE @variable3 END.

It is simple: if @variable3 was assigned an empty string, the value 0 is assigned to the field field3 of the record currently being built; otherwise the value parsed from the current line is assigned to this field without modification.

Once the line has been processed, the whole record (all fields that were assigned a value) is stored as one new row in the destination table in the usual way (defaults assigned to non-listed fields, checks, triggers, ...).

After the record is stored, the next line of the source file is read, parsed, processed, and stored, and so on until the end of the file or an error.
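The per-line flow described above can be sketched in Python (hypothetical column names; this only mimics what the server does for each line, it is not how LOAD DATA is implemented):

```python
def load_line(line):
    """Model one iteration of the LOAD DATA parse-and-SET flow."""
    # Parse: split the source line on the field terminator '\t'.
    field1, field2, variable3, field4 = line.rstrip("\n").split("\t")
    # SET clause: an empty string in the variable becomes 0.
    field3 = 0.0 if variable3 == "" else float(variable3)
    # The assembled record that would be inserted as one row.
    return {"field1": field1, "field2": field2,
            "field3": field3, "field4": field4}

print(load_line("a\tb\t\tc"))   # the empty third field becomes 0.0
print(load_line("a\tb\t3.5\tc"))
```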

Akina
  • Many thanks Akina. I am not familiar with the syntax here: I assume (field1, field2, @variable3, field4, ...) refers to all the field names; what is @variable3? Could you illustrate with an example? – Paul Jan 21 '20 at 14:45
  • @Paul Some explaination added. – Akina Jan 21 '20 at 17:03
  • That is great help! Many thanks for the detailed explanation. – Paul Jan 21 '20 at 18:18