
I currently have a Google App Engine Python script that reads from Firestore. The data is structured as follows:

<collection name=Location1>
    <document name=timestamp1>
        data-array
    <document name=timestamp2>
        data-array
    <document name=timestamp3>
        data-array
<collection name=Location2>
    <document name=timestamp1>
        data-array
    <document name=timestamp2>
        data-array
    <document name=timestamp3>
        data-array

This is basically a cache. It suits my App Engine app, which generates a web page dynamically and therefore needs to be fast. The app takes a location as input and tries to pull the past 12 hours of data from the "cache". If any of that data is missing, it pulls it from the web and writes it into the cache. The issue is that I never delete old data, and there is no guarantee that a given location (collection) will ever be requested again. I could restructure my data into timestamp collections containing location documents, but that would make lookups difficult.

Therefore, I would like a separate program (a Cloud Function?) to periodically scan my Firestore database and delete every document more than X hours old from each collection, or delete the entire collection if all of its documents are too old or already deleted. I realize I probably need to add a timestamp field to each document so I can query on it.

The only guidance I have been able to find on how to accomplish this is https://firebase.google.com/docs/firestore/solutions/delete-collections. However, I am having trouble understanding it, and it seems to require me to specify the collection.

Doug Stevenson
eng3

1 Answer


Firestore doesn't have any built-in scheduling capabilities. You'll have to use another product in tandem with it.

If you're already using Google App Engine, you can just set up a cron job and have it execute a program to delete documents. If you prefer Cloud Functions, you can use a Firebase scheduled function, or if you don't use the Firebase tooling, you can put together the same functionality by configuring Cloud Scheduler to invoke an HTTP endpoint of your choice.
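As a rough illustration of the App Engine cron option, here is a minimal sketch of a plain WSGI endpoint that a cron job could call. The path `/tasks/purge`, the factory name `make_cron_app`, and the `cleanup` callable are all hypothetical names chosen for this example; it assumes a `cron.yaml` entry points at that path. App Engine adds an `X-Appengine-Cron: true` header to cron requests, which the handler checks before doing any work.

```python
def make_cron_app(cleanup, path="/tasks/purge"):
    """Build a WSGI app that runs `cleanup` only for App Engine cron requests.

    `cleanup` is any zero-argument callable returning a count of deleted
    documents; `path` is a hypothetical URL that cron.yaml would target.
    """
    def app(environ, start_response):
        if environ.get("PATH_INFO") != path:
            status, body = "404 Not Found", b"not found"
        elif environ.get("HTTP_X_APPENGINE_CRON") != "true":
            # Requests without the cron header did not come from the scheduler.
            status, body = "403 Forbidden", b"cron only"
        else:
            deleted = cleanup()
            status, body = "200 OK", f"deleted {deleted}".encode()
        headers = [("Content-Type", "text/plain"),
                   ("Content-Length", str(len(body)))]
        start_response(status, headers)
        return [body]
    return app
```

The actual deletion logic is passed in as `cleanup`, so the same function could also be invoked from a Cloud Function triggered by Cloud Scheduler.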

See also: Cloud Functions for Firebase trigger on time?

Doug Stevenson
  • Yes, I agree that a cron job or a scheduled function is the way to run it. My question was more about how to actually go through the collections programmatically. – eng3 Jun 23 '20 at 21:59
  • Your code will have to query for the documents to delete, iterate the query results, and remove them each individually. – Doug Stevenson Jun 23 '20 at 22:01
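The query-iterate-delete loop described in the last comment could look roughly like the sketch below. It assumes each document carries a `timestamp` field (as the question proposes adding) and that `db` is a `google.cloud.firestore.Client`; the function name `purge_old_documents` and its parameters are made up for illustration. `db.collections()` lists the top-level collections, so no collection name has to be hard-coded.

```python
from datetime import datetime, timedelta, timezone


def purge_old_documents(db, max_age_hours, now=None, field="timestamp"):
    """Delete documents older than `max_age_hours` from every top-level collection.

    `db` is expected to behave like google.cloud.firestore.Client.
    `field` is the assumed name of a timestamp field stored in each document.
    Returns the number of documents deleted.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    deleted = 0
    for collection in db.collections():          # every top-level collection
        query = collection.where(field, "<", cutoff)
        for snapshot in query.stream():          # iterate the query results
            snapshot.reference.delete()          # remove each one individually
            deleted += 1
    return deleted
```

A call might look like `purge_old_documents(firestore.Client(), max_age_hours=12)`. Firestore collections exist only while they contain documents, so a collection whose documents have all been deleted should no longer be listed and needs no separate removal step. Recent versions of the Python client prefer passing a `FieldFilter` via the `filter=` keyword of `where`; the positional form used here is the older style. For large collections, batching the deletes would likely be more efficient than one call per document.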