I am writing into a database in badges of X users at a time, and I want to update the user information (replace a row). I only want to have one row per user.
However, when using the parameter if_exists='replace'
the new badge replaces the old badge (deleting previous -distinct- users).
And if I do 'append'
instead, I will end up with loads of duplicates.
My table has multiple columns as PK, so I am using this definition from krvkir: https://stackoverflow.com/a/31045044/9153261
def to_sql_k(self, frame, name, if_exists='fail', index=False, index_label=None,
schema=None, chunksize=None, dtype=None, **kwargs):
'''Definition to add PK'''
if dtype is not None:
for col, my_type in dtype.items():
if not isinstance(to_instance(my_type), TypeEngine):
raise ValueError('The type of %s is not a SQLAlchemy '
'type ' % col)
table = pd.io.sql.SQLTable(name, self, frame=frame, index=index,
if_exists=if_exists, index_label=index_label,
schema=schema, dtype=dtype, **kwargs)
table.create()
table.insert(chunksize)
And I am using this other definition to store information into DB:
def writting_in_db(df, user_list, engine):
try:
ddtype={
'User': INTEGER,
'C1': INTEGER,
'C2': INTEGER
}
pandas_sql = pd.io.sql.pandasSQL_builder(engine)
to_sql_k(pandas_sql, df, 'TableName', keys=('User', 'C1'),
if_exists='append', dtype=ddtype)
except:
print("Exception, this list of users was not updated into database:\n")
sys.stdout.flush()
print(user_list)
sys.stdout.flush()
Any solution that isn't 'delete ALL the contents of the database before every iteration'?
And ideally that it also isn't, 'querying all the existing users on the DB and iterating to see if they are present on the new df to export'.
Salut,