I have a Cloud Function in Python 3.7 that writes/updates small documents in Firestore. Each document uses a user_id as its document ID and has two fields: a timestamp and a map (a dictionary) with three key-value pairs, all of them very small.
This is the code I'm using to write/update Firestore:
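from datetime import datetime
# One document per user, keyed by user_id
doc_ref = db.collection(u'my_collection').document(user['user_id'])
# Convert the date to a datetime at midnight before writing it
date_last_seen = datetime.combine(date_last_seen, datetime.min.time())
doc_ref.set({u'map_field': map_value, u'date_last_seen': date_last_seen})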
My goal is to call this function once a day and write/update ~500K documents. I ran the following tests; the execution time is given for each:
Test A: Process the output to 1000 documents. Don't write/update Firestore -> ~ 2 seconds
Test B: Process the output to 1000 documents. Write/update Firestore -> ~ 1 min 3 seconds
Test C: Process the output to 5000 documents. Don't write/update Firestore -> ~ 3 seconds
Test D: Process the output to 5000 documents. Write/update Firestore -> ~ 3 min 12 seconds
My conclusion here: writing/updating Firestore accounts for roughly 97-98% of the execution time (about 61 of 63 seconds in Test B and 189 of 192 seconds in Test D).
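A quick check of those timings, as a sketch: the numbers are taken from Tests A-D above, the write time is estimated as the difference between the run with writes and the run without, and the projection to 500K documents assumes the time scales linearly with the document count (which the tests only roughly support).

```python
# Timings from Tests A-D, in seconds: (documents, run without writes, run with writes)
tests = {
    "A/B": (1000, 2, 63),    # ~2 s vs ~1 min 3 s
    "C/D": (5000, 3, 192),   # ~3 s vs ~3 min 12 s
}
TARGET_DOCS = 500_000  # daily goal


def summarize(docs, no_write_s, with_write_s):
    """Return (share of time spent writing, ms per written doc, projected hours)."""
    write_s = with_write_s - no_write_s          # time attributable to Firestore
    share = write_s / with_write_s               # fraction of the whole run
    per_doc_ms = 1000 * write_s / docs           # average write cost per document
    # Assumption: linear scaling from the test size to TARGET_DOCS.
    projected_hours = with_write_s / docs * TARGET_DOCS / 3600
    return share, per_doc_ms, projected_hours


for name, (docs, no_write_s, with_write_s) in tests.items():
    share, per_doc_ms, hours = summarize(docs, no_write_s, with_write_s)
    print(f"{name}: writes = {share:.1%} of run time, "
          f"{per_doc_ms:.1f} ms/doc, ~{hours:.1f} h for {TARGET_DOCS} docs")
```

Under that linear assumption, the daily run would take somewhere between about 5 and 9 hours if every document is written one at a time.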
Question: How can I write/update ~500K documents every day efficiently?
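One direction that might be worth testing is grouping the writes with the client's batched-write API (`db.batch()`, `batch.set()`, `batch.commit()`) so that each round trip carries many documents instead of one. The following is only a sketch, not a measured solution: the function name `write_in_batches`, the shape of the `users` records (`map_value` and `date_last_seen` keys), and the batch size of 500 (the documented per-batch limit as far as I know, to be checked against the current docs) are assumptions for illustration.

```python
from datetime import datetime

# Assumption: 500 operations per batch; verify against the current Firestore limits.
MAX_BATCH_SIZE = 500


def write_in_batches(db, users, batch_size=MAX_BATCH_SIZE):
    """Write one document per user, committing every `batch_size` documents.

    `db` is a Firestore client; `users` is an iterable of dicts with the
    hypothetical keys 'user_id', 'map_value' and 'date_last_seen' (a date).
    Returns the number of commits issued.
    """
    batch = db.batch()
    pending = 0   # operations queued in the current batch
    commits = 0
    for user in users:
        doc_ref = db.collection(u'my_collection').document(user['user_id'])
        # Same conversion as in the single-document write: date -> datetime at midnight
        date_last_seen = datetime.combine(user['date_last_seen'], datetime.min.time())
        batch.set(doc_ref, {u'map_field': user['map_value'],
                            u'date_last_seen': date_last_seen})
        pending += 1
        if pending == batch_size:
            batch.commit()        # one round trip for the whole group
            commits += 1
            batch = db.batch()    # start a fresh batch
            pending = 0
    if pending:
        batch.commit()            # flush the final, partially filled batch
        commits += 1
    return commits
```

With 500K documents and a batch size of 500 this would issue 1000 commits instead of 500K individual `set` calls. Whether that actually brings the run time down enough would need to be measured the same way as Tests B and D.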