Pandas dataframe to_sql method with python3

Question

I have a pandas dataframe with many different columns and will like to insert specific columns into a mySQL database.

I understand from the documentation that the way to do so is to use to_sql like so:

import pandas as pd
import mysql.connector
from sqlalchemy import create_engine

engine = create_engine(db)
data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)

However, when I run this in python3, I am getting the following ImportError:

Traceback (most recent call last):
  File "myfile.py", line 130, in <module>
    engine = create_engine('mysql+mysqlconnector://user:password@localhost:port/db')
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
    return strategy.create(*args, **kwargs)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
    return __import__('MySQLdb')
ImportError: No module named 'MySQLdb'

Upon doing some research I found out that actually mySQLdb does not support python3, but the documentation on pandas says that it requires the SQLAlchemy engine to connect to the database.

Is there a way to output my pandas dataframe into a mySQL database directly using python3?

SQLAlchemy has support for different mySQL DBAPIs, as described [in the docs](http://docs.sqlalchemy.org/en/rel_1_1/dialects/mysql.html). Try using one of the APIs listed there. — DocZerø, May 19 '17 at 06:52
Hi @Kristof, thank you for the comment. I managed to get it to work with the mysql-connector driver. Will you like to post your comment as an answer so that I can accept it? — Poh Zi How, May 19 '17 at 10:01

score 3 · Accepted Answer · answered May 19 '17 at 10:04

3

SQLAlchemy has support for different mySQL DBAPIs, as described in the docs. At the time of writing, these are:

MySQL-Python
PyMySQL
MySQL Connector/Python
CyMySQL
OurSQL
Google Cloud SQL
PyODBC
zxjdbc for Jython

Try using one of the APIs listed instead of mySQLdb.

answered May 19 '17 at 10:04

DocZerø

8,037
11
38
66

2

Did you get this to work in to_sql? Can you be a little more explicit about how you got this working? – ben.dichter Jul 19 '17 at 15:46

score 3 · Answer 2 · answered Jun 02 '18 at 14:38

Expanding on ƘɌỈSƬƠƑ's answer using pymysql as the DBAPI

import pandas as pd
import pymysql
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://USER:PASSWORD@HOST/DATABASE")
con = engine.connect()

data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)

score 1 · Answer 3 · answered May 19 '17 at 06:50

Use PyMySQL module. It's the latest one and you need to MySQL running in the background of your system. The code goes this way,

import pymysql, pandas as pd
part1 = pd.read_csv("your_file_location)
part2 = part1["Column 1 Name", "Column2 Name"]
#and further code of yours#
for i in your_pandas_dataset:
    sql = "INSERT INTO your_table_name(your_column_attributes) \
            VALUES ('%d', '%s', #and so on#) %\
            (i[1], i[2],#and so on#)"
            self.cursor.execute(sql)
            self.db.commit()

You need to insert data using loops. And the data will be imported into SQL database.

Pandas dataframe to_sql method with python3

3 Answers3