I have the following pandas dataframe:
DB Table Column Format
Retail Orders ID INTEGER
Retail Orders Place STRING
Dept Sales ID INTEGER
Dept Sales Name STRING
I want to loop on the Tables, while generating a SQL for creating the tables. e.g.
create table Retail.Orders ( ID INTEGER, Place STRING)
create table Dept.Sales ( ID INTEGER, Name STRING)
What I've already done is get distinct db & tables using drop_duplicate
and then for each table apply a filter and concatenate the strings to create a sql.
def generate_tables(df_cols):
tables = df_cols.drop_duplicates(subset=[KEY_DB, KEY_TABLE])[[KEY_DB, KEY_TABLE]]
for index, row in tables.iterrows():
db = row[KEY_DB]
table = row[KEY_TABLE]
print("DB: " + db)
print("Table: " + table)
sql = "CREATE TABLE " + db + "." + table + " ("
cols = df_cols.loc[(df_cols[KEY_DB] == db) & (df_cols[KEY_TABLE] == table)]
for index, col in cols.iterrows():
sql += col[KEY_COLUMN] + " " + col[KEY_FORMAT] + ", "
sql += ")"
print(sql)
Is there a better approach to iterate over the dataframe?