I'm getting data loss when doing a csv import using the Python MySQLdb module. The crazy thing is that I can load the exact same csv using other MySQL clients and it works fine.
- It works perfectly fine when running the exact same command with the exact same csv from sequel pro mysql client
- It works perfectly fine when running the exact same command with the exact same csv from the mysql command line
- It doesn't work (some rows truncated) when loading through python script using mysqldb module.
It's truncating about 10 rows off of my 7019 row csv.
The command I'm calling:
LOAD DATA LOCAL INFILE '/path/to/load.txt' REPLACE INTO TABLE tble_name FIELDS TERMINATED BY ","
When the above command is ran using the native mysql client on linux or sequel pro mysql client on mac it works fine and I get 7019 rows imported.
When the above command is ran using Python's MySQLdb module such as:
dest_cursor.execute( '''LOAD DATA LOCAL INFILE '/path/to/load.txt' REPLACE INTO TABLE tble_name FIELDS TERMINATED BY ","''' )
dest_db.commit()
Most all rows are imported but I get thrown out a slew of
Warning: (1265L, "Data truncated for column '<various_column_names' at row <various_rows>")
When the warnings pop up, it states at row <row_num>
but I'm not seeing that correlate to the row in the csv (I think it's the row it's trying to create on the target table, not the row in the csv) so I can't use that to help troubleshoot.
And sure enough, when it's done, my target table is missing some rows.
Unfortunately with over 7,000 rows in the csv it's hard to tell exactly which line it's choking on for further analysis. When the warnings pop up, it states at row <row_num>
but I'm not seeing that correlate to the row in the csv (I think it's the row it's trying to create on the target table, not the row in the csv) so I can't use that to help troubleshoot.
There are many rows that are null and/or empty spaces but they are importing fine.
The fact that I can import the entire csv using other MySQL clients makes me feel that the MySQLdb module is not configured right or something.
This is Python 2.7 Any help is appreciated. Any ideas on how to get better visibility into which line it's choking up on would be helpful.