I am following this advice to write a pandas dataframe into a SQL database using pyodbc. In a nutshell, this is the SQL code for table creation:
CREATE TABLE [HumanResources].[DepartmentTest](
[DepartmentID] [smallint] NOT NULL,
[Name] [dbo].[Name] NOT NULL,
[GroupName] [dbo].[Name] NOT NULL
)
GO
and this is how you write records from a pandas dataframe into the table.
import pyodbc
import pandas as pd
# read data from the csv file into a dataframe.
# working directory for the csv file: type "pwd" in Azure Data Studio or Linux
# working directory in Windows: c:\users\username
df = pd.read_csv("c:\\users\\username\\department.csv")
# Some other example server values are
# server = 'localhost\sqlexpress' # for a named instance
# server = 'myserver,port' # to specify an alternate port
server = 'yourservername'
database = 'AdventureWorks'
username = 'username'
password = 'yourpassword'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
# Insert Dataframe into SQL Server:
for index, row in df.iterrows():
cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
cnxn.commit()
cursor.close()
It works, but the code is extremely rigid: the column data types and even the column names are hardcoded. For my use case I need a more dynamic scenario in which a table is created in the SQL database based on a well-formed pandas dataframe and all of its rows are then pushed to it. The column names, the column types and even the table name will only be known at execution time.
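To make the requirement concrete, here is a rough sketch of the kind of helper I imagine. It is only illustrative: the dtype-to-SQL mapping, the fallback to NVARCHAR(MAX), and the names sql_type_for / write_dataframe are my own assumptions, not an existing API.

import pandas as pd
import pyodbc

def sql_type_for(dtype):
    # Naive dtype -> SQL Server type mapping; purely illustrative.
    if pd.api.types.is_integer_dtype(dtype):
        return "INT"
    if pd.api.types.is_float_dtype(dtype):
        return "FLOAT"
    if pd.api.types.is_bool_dtype(dtype):
        return "BIT"
    if pd.api.types.is_datetime64_any_dtype(dtype):
        return "DATETIME2"
    return "NVARCHAR(MAX)"  # fall back to text for object columns

def write_dataframe(cnxn, table_name, df):
    # Build the CREATE TABLE and INSERT statements from the dataframe itself,
    # so no column name or type is hardcoded.
    cols = ", ".join(f"[{c}] {sql_type_for(t)}" for c, t in df.dtypes.items())
    col_list = ", ".join(f"[{c}]" for c in df.columns)
    placeholders = ", ".join("?" for _ in df.columns)
    cursor = cnxn.cursor()
    cursor.execute(f"CREATE TABLE {table_name} ({cols})")
    cursor.executemany(
        f"INSERT INTO {table_name} ({col_list}) VALUES ({placeholders})",
        list(df.itertuples(index=False, name=None)),
    )
    cnxn.commit()
    cursor.close()

Usage would then be something like write_dataframe(cnxn, "[HumanResources].[DepartmentTest2]", df), with the table name supplied at run time.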
This seems like a very standard operation, so is there a ready-made pandas/pyodbc function that caters to this? Basically, I am asking for code that writes a pandas dataframe to SQL (preserving the column data types, i.e. objects become strings, integers stay integers, etc.) without assuming the column names.
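The closest built-in I have found so far is DataFrame.to_sql, which appears to want a SQLAlchemy engine rather than the raw pyodbc connection above. A minimal sketch, assuming the same placeholder server, database and credentials:

import urllib.parse
import pandas as pd
from sqlalchemy import create_engine

# Wrap the existing ODBC connection string in a SQLAlchemy engine.
params = urllib.parse.quote_plus(
    "DRIVER={SQL Server};SERVER=yourservername;DATABASE=AdventureWorks;"
    "UID=username;PWD=yourpassword"
)
engine = create_engine(f"mssql+pyodbc:///?odbc_connect={params}")

df = pd.read_csv("c:\\users\\username\\department.csv")
# to_sql creates the table and infers column types from the dataframe's
# dtypes; the table name here is just a placeholder.
df.to_sql("DepartmentTest2", engine, schema="HumanResources",
          if_exists="replace", index=False)

Is to_sql the recommended route for this, or is there a cleaner pyodbc-only way that preserves the column types?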