I've been trying various methods to bulk upSert an Azure SQL (MSSQL) database using SQLAlchemy 2.0, the source table is fairly large 2M records and I need to bulk upSert 100,000 records (most of which won't be there).
NOTE This will run as an Azure function so if there is a better way I'm open to this
class issues(Base):
__tablename__ = "issues"
id = mapped_column('id', String(36), primary_key=True)
created = mapped_column ('created', DateTime())
updated = mapped_column ('updated', DateTime())
status = mapped_column('status', String(50))
severity = mapped_column('severity', String(10))
control_id = mapped_column('control_id', String(36))
entity_id = mapped_column('entity_id', String(36))
Example data
issueList = {
issues( "1234", datetime.now(), datetime.now() , "Test", "Low8", "con123", "ent123"),
issues( "5678", datetime.now(), datetime.now() , "Test", "Low9", "con123", "ent123"),
}
Currently I'm doing session.merge(issue)
but it's slow and doesn't support bulk inserts, I've looked at https://stackoverflow.com/a/69968892/1697288 but have been getting errors as I was passing:
issueList = {
"1234": { id: "1234", "created": datetime.now(), "updated": datetime.now, "status": "Test", "severity": "Low16", "control_id": "con123", "entity_id": "ent123" },
"5678": { id: "5678", "created": datetime.now(), "updated": datetime.now, "status": "Test", "severity": "Low9", "control_id": "con123", "entity_id": "ent123" },
}
upsert_data (session, issueList, "issues", "id")
It seems to be expecting a model not text for the 3rd params, so I wasn't sure what to send.
Any suggestions of a fast model would be great. Only this application will be inserting data so locking the db isn't an issue as long as the lock is cleared on error.
Thanks.