I am still a newbie in pandas. I have an excel file which contains more than 700 columns with 600 rows. I am trying to split it into multiple dataframes so I can insert it into sql table. The reason why I am trying to split is because I am getting an error saying The statement has been terminated. (3621); [42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot create a row of size 8349 which is greater than the allowable maximum row size of 8060. I want to split it by 250 columns per dataframe and also take the first row so I can join it in the sql.
Another reason why I cannot specify the dataframe range is because, different excel files might have different number of columns. Lets say the excel file contains 800 columns, but I set it as 700: 850, it gives me an error saying iloc giving 'IndexError: single positional indexer is out-of-bounds'.
If there is a way to do it dynamically by splitting it by 250 rows, it would be great. Below is my code which is working but I need to specify the range of index:
def sqlcol(dfparam):
dtypedict = {}
for i,j in zip(dfparam.columns, dfparam.dtypes):
if "object" in str(j):
dtypedict.update({i: sqla.types.NVARCHAR(length=255)})
if "datetime" in str(j):
dtypedict.update({i: sqla.types.DateTime()})
if "float" in str(j):
dtypedict.update({i: sqla.types.Float()})
if "int" in str(j):
dtypedict.update({i: sqla.types.BIGINT()})
if "decimal" in str(j):
dtypedict.update({i: sqla.types.DECIMAL()})
return dtypedict
def import_varchar_to_hst03(db:str,tb_name:str,df):
n= 1
import pandas as pd
import sqlalchemy as sqla
import urllib
import pyodbc
t=sqlcol(df1)
a=sqlcol(df2)
c=sqlcol(df3)
quoted = urllib.parse.quote_plus("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE="+db+";Trusted_Connection=yes;")
engine = sqla.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted), fast_executemany = True)
df1.to_sql(tb_name, schema='dbo', con = engine, index=False,dtype=t,if_exists='replace')
df2.to_sql(tb_name + ( "A" * n), schema='dbo', con = engine, index=False,dtype=a,if_exists='replace')
df3.to_sql(tb_name + ( "B" * n), schema='dbo', con = engine, index=False,dtype=c,if_exists='replace')
import pandas as pd
import numpy as np
Ex=pd.read_excel(r'C:\Users\sriram.ramasamy\Desktop\Testsriram.xlsx',sheet_name=None)
for sh,v in Ex.items():
df=pd.DataFrame(v)
df1 = df.iloc[:,:255]
df2 = df.iloc[:,np.r_[0:1,256:500]]
df3 = df.iloc[:,np.r_[0:1,501:700]]
import_varchar_to_hst03('InsightMaster',sh,df)
print('data imported to database')