0

Sorry for what is probably a simple question. I've been working with pandas for a little & have had no trouble loading excel files into MSSQL, however, I haven't been able to figure it out for mysql. I wasn't able download the MySQLdb package, so I installed the mysql package & used mysql.connector as shown below:

Original Code

import pandas as pd
import mysql.connector as mysql

data = pd.read_excel(r"[my file path]")
df = pd.DataFrame(data, columns=['Name', 'country', 'age', 'gender'])


conn = mysql.connect('Driver={SQL Server};'
                   'Server=[my server name];'
                   'Database=[my database];'
                   'Trusted_Connection=yes;')

cursor = conn.cursor()

print(df)


for row in df.itertuples():
    cursor.execute('''
                INSERT INTO [database].dbo.[table] (Name, country, age, gender)
                VALUES (?,?,?,?)
                ''',
                   row.Name,
                   row.country,
                   row.age,
                   row.gender
                   )
conn.commit()

Not sure what exactly I should be doing to make this all work.

Code that works

import mysql.connector
import pandas as pd

conn= mysql.connector.connect(user='[username]', password='[password]', host='[hostname]', database='[databasename]')
cursor = conn.cursor()
excel_data = pd.read_excel(r'[filepath]',sep=',', quotechar='\'')

for row in excel_data.iterrows():
    testlist = row[1].values
    cursor.execute("INSERT INTO [tablename](Name, Country, Age, Gender)"
                   " VALUES('%s','%s','%s','%s')" % tuple(testlist))

conn.commit()
cursor.close()
conn.close()

More concise code which also works

import pandas as pd
from sqlalchemy import create_engine, types

engine = create_engine('mysql://root:[password]@localhost/[database]') # enter your password and database names here

df = pd.read_excel(r"[file path]",sep=',',quotechar='\'',encoding='utf8') # Replace Excel_file_name with your excel sheet name
df.to_sql('[table name]',con=engine,index=False,if_exists='append') # Replace Table_name with your sql table name
gg255
  • 1
  • 1
  • 2
  • Hi, could you please explain a bit more about what your current problem is? You have explained clearly what you have done, but it is not clear from what you have written what the symptoms are, what your code is trying to achieve, or what exactly you are unclear about? – ralfe Jul 09 '20 at 14:39
  • I'm trying to load an excel file into a mysql database using pandas, the above code worked fine for MSSQL (minus the import mysql; i used import pyodbc) however I could not figure out how to make it work for mysql. Was looking for help on that front. Am I setting up my connection wrong? Is the syntax wrong? Thanks for the reply! – gg255 Jul 09 '20 at 15:23

2 Answers2

1

Pandas has a number of great functions for dealing with SQL databases. Once you have read from excel, created your df, and connected to your database, you can simply use:

df.to_sql("table_name", conn, if_exists='replace', index=False)

and the columns of your database table will be created automatically. Of course, you can visit the link above and play around with the other parameters. Also, remember to close your database connection with conn.close().

Edit: Figured I'd add how to simply read the dataframe out of the database as well.

df = pd.read_sql("SELECT * FROM table_name", conn)

Hope this helps!

Kel Varnsen
  • 314
  • 2
  • 8
  • Additionally, you can pass the `engine` object from `sqlalchemy.create_engine` into the `con` parameter and not have to fuss with opening/closing a connection. – S3DEV Jul 09 '20 at 14:45
  • Hey thanks for the reply! Does this apply to MySQL as well as MSSQL? When I try the df.to_sql statement (for mssql) I get the following error: pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)") I removed my insert statement & tried your line for this – gg255 Jul 09 '20 at 15:35
  • @gg255 It should work with MySQL as well. I'm not sure about MSSQL, but this works great with SQLite databases. There's a discussion [here](https://stackoverflow.com/questions/16476413/how-to-insert-pandas-dataframe-via-mysqldb-into-database) on how to configure this to work with MySQL databases if you want to check that out. If `pd.to_sql()` doesn't work I would try the `sql.write_frame()` and changing the flavor to 'mssql', though I can't say if this works as I've never dealt with MSSQL databases. – Kel Varnsen Jul 09 '20 at 16:07
1

In addition to using dataframe like already-mentioned, I want to add a bit. I am not completely sure if you can install mysql workbench. If so, it provides you with an easy way of importing csv files into your mysql database. Hope this helps, too!

Cheolsoon Im
  • 716
  • 1
  • 6
  • 11
  • Suggestion: Perhaps include a link to the page where the OP can download the installer. And good suggestion. – S3DEV Jul 09 '20 at 14:43
  • 1
    Thanks! Here it is. https://dev.mysql.com/downloads/workbench/ – Cheolsoon Im Jul 09 '20 at 14:49
  • I actually do have workbench & I know how to import csv files into mysql using it....however I am trying to learn how to do so using python/pandas, rather than workbench itself! Thanks. – gg255 Jul 09 '20 at 15:25
  • Oh, I see. Thanks for the comment! – Cheolsoon Im Jul 09 '20 at 15:28
  • FYI. https://stackoverflow.com/questions/10154633/load-csv-data-into-mysql-in-python – Cheolsoon Im Jul 09 '20 at 15:32
  • Oh thanks that is really helpful. I get this error: MySQLdb._exceptions.OperationalError: (1366, "Incorrect integer value: '%d' for column 'age' at row 1") when I use %s or %d. Any idea what I'm messing up? – gg255 Jul 09 '20 at 15:57
  • Great. :) I will get back to you when I get some idea. – Cheolsoon Im Jul 09 '20 at 17:06