I have a massive dump of data that I need to upload through an API. The requests have to be made one record at a time for data-validation reasons: the API can accept up to 1,000 records in a single POST request, but the validation error it returns is too vague to tell which record in a batch has the issue.
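For context, a batched call would look roughly like the sketch below; the endpoint URL, payload shape, and file name are placeholders since I'm leaving out the real API specifics, and when one record in a batch is bad the whole request fails with that vague error.

import csv
import requests

BATCH_URL = "https://example.com/api/records/batch"  # placeholder, not the real endpoint
BATCH_SIZE = 1000  # the documented per-request limit

def upload_batch(records):
    # One POST for up to 1,000 records; the payload shape is a guess
    resp = requests.post(BATCH_URL, json={"records": records})
    resp.raise_for_status()  # the vague validation error covers the whole batch

batch = []
with open("data.csv", newline="") as f:  # placeholder file name
    for row in csv.DictReader(f):
        batch.append(dict(row))  # stand-in for the real dict-building step
        if len(batch) == BATCH_SIZE:
            upload_batch(batch)
            batch = []
if batch:
    upload_batch(batch)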
I'll spare the unnecessary details. My script is essentially doing this:
for row in reader:
    values = {...}  # Build dict from row to pass on to the request
    upload_record(values)
With hundreds of thousands of records, this is extremely slow; I'm currently sitting at around 50k records after 15 hours.
How can I speed this up? The order of the data doesn't matter; it just needs to get from the CSV into the API's table as fast as possible.
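One option I've looked at is keeping the single-record requests but running them concurrently with a thread pool. This is only a rough sketch: the worker count and file name are placeholders, and upload_record is my existing function.

import csv
from concurrent.futures import ThreadPoolExecutor, as_completed

def upload_row(row):
    values = dict(row)  # stand-in for the real dict-building step
    upload_record(values)  # my existing single-record upload, unchanged

with open("data.csv", newline="") as f:  # placeholder file name
    rows = list(csv.DictReader(f))

# 20 workers is an arbitrary starting point; it would need tuning against the API's rate limits
with ThreadPoolExecutor(max_workers=20) as pool:
    futures = [pool.submit(upload_row, row) for row in rows]
    for future in as_completed(futures):
        future.result()  # re-raises any exception from a failed upload

I went with threads rather than multiprocessing in the sketch because the work is almost entirely network I/O.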