I'm currently using pandas to_sql to load a large DataFrame into a SQL database. I'm using SQLAlchemy to connect to the database, and part of that process is defining the columns of the database tables.
My question is: when I run to_sql on a DataFrame, how does it know which DataFrame column goes into which field in the database? Does it look at the column names in the DataFrame and match them to fields of the same name in the database, or does it go by the order of the columns?
Here's some example code to facilitate discussion:
import pandas as pd
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, Numeric

engine = create_engine('sqlite:///store_data.db')
meta = MetaData()
table_pop = Table('xrf_str_geo_ta4_1511', meta,
    Column('TDLINX', Integer, nullable=True, index=True),
    Column('GEO_ID', Integer, nullable=True),
    Column('PERCINCL', Numeric, nullable=True)
)
meta.create_all(engine)
for df in pd.read_csv(file, chunksize=50000, iterator=True, encoding='utf-8', sep=','):
    df.to_sql('xrf_str_geo_ta4_1511', engine, if_exists='append', index=False)
The DataFrame in question has three columns: TDLINX, GEO_ID, and PERCINCL.
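One way to settle this is a small experiment: create a table whose columns are declared in one order, then append a DataFrame whose columns are in the opposite order, and see where the values land. This is a minimal sketch (the table and column names here are made up for illustration, not taken from the code above):

```python
import pandas as pd
from sqlalchemy import create_engine, MetaData, Table, Column, Integer

engine = create_engine('sqlite://')  # throwaway in-memory database
meta = MetaData()
Table('t', meta,
    Column('a', Integer),
    Column('b', Integer)
)
meta.create_all(engine)

# DataFrame columns deliberately in the opposite order from the table.
df = pd.DataFrame({'b': [10, 20], 'a': [1, 2]})
df.to_sql('t', engine, if_exists='append', index=False)

out = pd.read_sql('SELECT a, b FROM t', engine)
print(out)
```

If the values from the DataFrame's 'a' column end up in the table's 'a' column despite the different ordering, the mapping is by column name rather than by position.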