37

I need to import a CSV file with 20 million rows and 2 columns into a database, but MySQL Workbench's data import wizard is extremely slow; judging by the progress bar, it will probably take a month to finish.

There has to be some faster way to do this, I hope.

Charlie Brumbaugh
  • 219
  • 1
  • 5
  • 15
BlueStarry
  • 677
  • 1
  • 7
  • 13
  • I wrote [This Answer](http://stackoverflow.com/a/32702768) up for a fella. Took his 8-hour import down to a minute or so. Seems extreme, but that is what he said. – Drew Oct 23 '15 at 07:19
  • This may be happening because MySQL Workbench does a `COMMIT` after every insert. – Osama Al-Banna Nov 29 '22 at 17:50

7 Answers

28

Always use `LOAD DATA INFILE` as the first option for huge data sets.

See the MySQL manual page on `LOAD DATA INFILE`.

I wrote up several answers to this question, but for a peer comparison, see this user's question, my answer there, and his time comparisons of Workbench vs. `LOAD DATA INFILE`.
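For reference, a typical invocation looks roughly like this (the file path, table name, and delimiters below are placeholders you'd replace with your own):

```sql
-- LOCAL reads the file from the client machine; drop it if the file
-- is on the server and local_infile is disabled
LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the header row
```
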

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • 14
    If this helps anyone, I still use the import wizard on the first row of data because it has a decent wizard for building the table from the CSV. Then I use the `LOAD DATA INFILE` command to populate the table. It's a nice hybrid approach. It's a real shame the UI doesn't work this way under the hood; as it is coded now, it's pretty worthless for bulk import. – raider33 May 19 '16 at 02:34
  • Thanks @raider33 for sharing – Drew May 19 '16 at 02:35
  • Beware, this solution will timeout in 30 seconds. Lots of people talking about 'wait_timeout' but at 28800 seconds you would think it would last more than 30 seconds. – Alex Barker Jan 15 '19 at 00:13
4

As an alternative, dump your CSV data into a SQL script; you will need to write a little code for this. Essentially, your CSV data gets converted into multi-row `INSERT` statements like:

    INSERT INTO TABLE_NAME VALUES (1,2),(1,3),....;

Then open the MySQL shell and run the script with the `SOURCE` command:

    mysql> source C:/Users/Desktop/sql scripts/script.sql

Your data will import much faster than directly importing a CSV with millions of records.
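A minimal sketch of such a converter (file names, table name, and batch size here are hypothetical; batching rows into multi-row `INSERT` statements keeps the script compact and the import fast):

```python
import csv

def csv_to_insert_script(csv_path, sql_path, table_name, batch_size=1000):
    """Convert a CSV file into a .sql script of multi-row INSERT statements."""
    with open(csv_path, newline='') as src, open(sql_path, 'w') as dst:
        reader = csv.reader(src)
        header = next(reader)  # assumes the first row holds column names
        cols = ', '.join(header)
        batch = []
        for row in reader:
            # every CSV field arrives as a string: quote it and escape
            # embedded single quotes for SQL
            values = ', '.join("'" + field.replace("'", "''") + "'" for field in row)
            batch.append(f"({values})")
            if len(batch) == batch_size:
                dst.write(f"INSERT INTO {table_name} ({cols}) VALUES\n"
                          + ",\n".join(batch) + ";\n")
                batch = []
        if batch:  # flush the final partial batch
            dst.write(f"INSERT INTO {table_name} ({cols}) VALUES\n"
                      + ",\n".join(batch) + ";\n")
```

The generated `script.sql` can then be fed to `mysql> source ...` as described above.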

Gaurav Lad
  • 1,788
  • 1
  • 16
  • 30
3

If you don't want to write code, I suggest trying another free GUI client like HeidiSQL. It imports CSV/text files much quicker than MySQL Workbench.

malnosna
  • 164
  • 1
  • 5
1

I had a similar issue with MySQL Workbench. The alternative I found is Toad for MySQL (https://www.toadworld.com/m/freeware/1469).

It would take me 40 minutes via MySQL admin, connecting to a remote MySQL server; on the MySQL server itself the upload takes a few minutes. With Toad I am able to connect to the remote server and upload in a few minutes as well. I did try HeidiSQL, but did not find it as friendly for importing.

1

Always prefer `LOAD DATA INFILE` for importing data sets. The inconvenience is that you have to create the table structure before importing, whereas the import wizard lets you create a new table on the fly directly from a CSV or JSON file.

I think the reason for this slowness is that Workbench uses Python for the import wizard. You can see this in the import wizard's log when an error occurs: it is a Python console log.

If you don't want to create the structure yourself for whatever reason, you can start the wizard so that it creates the table from the CSV, then kill the process. After that, delete everything from the table and use `LOAD DATA INFILE`. It's a kind of ugly "hack", but it worked for me.
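In SQL terms, the hack boils down to something like this (table and file names are placeholders):

```sql
TRUNCATE TABLE your_table;  -- drop the rows the wizard already inserted

LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE your_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
```
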

adnls
  • 41
  • 1
1

You may use the following Python code to convert the CSV file into a file of SQL `INSERT` statements:

import csv

def csv_to_sql_insert(csv_filename, table_name):
    with open(csv_filename, 'r', newline='') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        columns = csv_reader.fieldnames

        for row in csv_reader:
            # DictReader yields every field as a string, so quote each
            # value and escape embedded single quotes for SQL
            values = ["'" + row[col].replace("'", "''") + "'" for col in columns]
            sql_insert = f"INSERT INTO {table_name} ({', '.join(columns)}) VALUES ({', '.join(values)});"
            print(sql_insert)

# Replace these values with your actual CSV file and desired table name
csv_filename = 'data.csv'
table_name = 'your_table'

csv_to_sql_insert(csv_filename, table_name)
-2

My suggestion for a very fast export/import wizard is to use MySQL for Excel.

It is painless and fast, and you don't need to work around all the errors and tweaks that you have to deal with when using `LOAD DATA INFILE`.

Babak
  • 615
  • 6
  • 15