I realise there is a duplicate of this question (Execute more than 500 operations at once in Firestore Database), but the accepted answer there uses TypeScript, which doesn't work for me.

I'm fetching some data from a REST API which returns a JSON array of ~4000 objects. I want to save all of these objects into a collection in the Firestore database.

So, I'm trying to run a set of multiple batch updates.

I have some code which tries to chain some promises together in a for loop, taking data from an external source:

exports.getRawData = functions.https.onRequest((req, res) => {
  fetch('https://www.example.com', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body: qs.stringify(body)
  })
  .then(response => response.json())
  .then(data =>
      fetch(`https://www.example.com/Endpoint?StartDate=${req.query.startDate}&EndDate=${req.query.endDate}`, {
        method: 'GET',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${data.access_token}`
        }
      })
  )
  .then(response => response.json())
  .then(newData => {
    for (let i = 0, p = Promise.resolve(); i < 10; i++) {
        p = p.then(_ => new Promise(resolve =>
          {
                console.log(i);
                const batch = db.batch()
                let sliced = newData.slice(i * 40, (i+1) * 40)
                for (let j = 0; j < sliced.length; j++) {
                  let docRef = db.collection("SessionAttendance").doc()
                  batch.set(docRef, sliced[j])
                }
                batch.commit()
                resolve();
          }
        ));
    }
  })
  .then(() => {return res.send('OK')})
  .catch(err => console.log('Err: ' + err))
})

Weirdly this code doesn't always give the same error. Sometimes it says:

Function execution took 3420 ms, finished with status: 'connection error'

I've read that this error usually means some Promises are not being returned, so perhaps that is the case here.

Also, on some deploys, it returns this error:

Function execution took 60002 ms, finished with status: 'timeout'

And then it just keeps running over and over again.
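
One possible reading of the first block, offered as a sketch and not a confirmed diagnosis: the inner promise calls `resolve()` without waiting for `batch.commit()`, the chain `p` is never returned from the enclosing `.then`, and the loop covers only the first 400 objects (10 slices of 40). The helper below chains one commit per slice and returns the chain so the caller can wait on it. The name `commitSequentially` and the `chunkSize` parameter are made up for illustration; `db` is assumed to be the same Firestore instance used above, and the default of 500 is taken from the per-batch limit mentioned in the linked question's title.

```javascript
// Hypothetical helper: writes `items` to `collectionName` one batch at a time.
// `db` is assumed to be a Firestore instance exposing batch() and collection().
function commitSequentially(db, collectionName, items, chunkSize = 500) {
  let chain = Promise.resolve();
  for (let start = 0; start < items.length; start += chunkSize) {
    const slice = items.slice(start, start + chunkSize);
    chain = chain.then(() => {
      const batch = db.batch();
      for (const item of slice) {
        // doc() with no argument creates a reference with an auto-generated ID
        batch.set(db.collection(collectionName).doc(), item);
      }
      // Returning the commit promise makes the next slice wait for this one
      return batch.commit();
    });
  }
  // The caller must return or await this promise
  return chain;
}
```

In the handler this would be used as `.then(newData => commitSequentially(db, 'SessionAttendance', newData))`. Separately, the final `.catch` above only logs, so when anything rejects no response is sent and an HTTP function that never responds runs until it times out; sending something like `res.status(500).send('Error')` from the `.catch` avoids that.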

I've tried quite a few different ways of solving this problem, but none of them seem to work.


I also tried this block of code:

.then(newData => {
  const batches = _.chunk(newData, 20).map(postSnapshots => {
    const writeBatch = db.batch();

    postSnapshots.forEach(post => {
      const docRef = db.collection("SessionAttendance").doc();
      // console.log('Writing ', post.id, ' in feed ', followerId);
      writeBatch.set(docRef, post);
    });

    return writeBatch.commit();
  });
  return Promise.all(batches);
})

It gives the same error as above:

Function execution took 3635 ms, finished with status: 'connection error'
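
The `_.chunk` call in this block is lodash's. If adding lodash is not wanted, a small stand-in could look like the sketch below. `chunk` is a hypothetical helper written for this illustration and is not part of any Firestore API.

```javascript
// Hypothetical stand-in for lodash's _.chunk:
// splits `items` into consecutive arrays of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

With it, `_.chunk(newData, 20)` becomes `chunk(newData, 20)` and the rest of the block stays the same.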
Tom Neill
  • There is nothing about the code in that answer you linked to that wouldn't work in a regular JavaScript project on Cloud Functions. The _ there comes from the lodash module. So, did you actually try it? Does the strategy of splitting into batches still work for you? – Doug Stevenson Mar 06 '18 at 01:41
  • Also, what kind of trigger are you using? Please show the entire code - all that's here is a few chains after a promise that's unseen. – Doug Stevenson Mar 06 '18 at 01:43
  • OK sorry I added the trigger to the code. I tried to use the code in the answer I linked, but the writeBatch function called an async function which I don't think the cloud function environment supports – Tom Neill Mar 06 '18 at 01:50
  • I don't see what you mean. The writeBatch function returned from Firestore.batch is just a regular function that you call to specify how the batch works. That's going to work fine in Cloud Functions. Maybe you should start with that code in your function and say how it specifically *doesn't* work. – Doug Stevenson Mar 06 '18 at 02:00
  • And, as far as I can see, you're way over-using promises, to the point where it's actually difficult to track what you're trying to do. You shouldn't have to use `new Promise` or `Promise.resolve` or anything like that. At most, maybe `Promise.all()` to wait for a bunch of batches to commit before sending a response. – Doug Stevenson Mar 06 '18 at 02:03
  • Yeah that wasn't clear, I meant that after that block of code in the answer, there was a "await Promises.all(batches)" which I couldn't use in this function. I tried to return the Promises, but I'm getting the same error. – Tom Neill Mar 06 '18 at 02:19
  • Using logging, how far does the function progress until you get that 'connection error'? You'll need to boil this down to a line of code where something goes wrong. – Doug Stevenson Mar 06 '18 at 02:31
  • When there's a timeout, it runs multiple times and never stops. When there's a "connection error" the function doesn't run at all, it doesn't even log anything when I put a console.log('Function running') in the first line. – Tom Neill Mar 06 '18 at 17:44
  • Then I'd guess the connection error is coming from one of your HTTP transactions, not Firestore. Try breaking the problem down into smaller bits and test each one thoroughly so you're not conflating things. Also learn to use catch() in promise chains to figure out exactly where an error is occurring. – Doug Stevenson Mar 06 '18 at 20:04

3 Answers

0

As it turns out, the second block of code worked perfectly.

The function returned the "connection error" message when I called it immediately after deploying. If I wait about 5 minutes after deploying, the error no longer appears.

I'm not sure whether this is supposed to happen or not, but the function does at least work for me now.

Tom Neill
0

I also got the connection error and the timeout, and it happened when I ran the request right after deploying. If I waited 2 minutes or so, it would run fine. It seems to take a little while for the function to be set up before you can run it.

0

A simple way to save any number of documents (limited only by memory) to Firestore is to use bulkWriter. It takes longer to execute, but it is safer than using Promise.all with batches:

await documents.reduce(
   (bulkWriter, document) => void bulkWriter.create(
      baseRefOrDbRoot.collection('collectionName').doc(),
      document,
   ) ?? bulkWriter,
   firestore.bulkWriter(),
).flush();
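
The `reduce` above relies on `void … ?? bulkWriter` always evaluating to `bulkWriter`, which is easy to misread. The same idea can be sketched as a plain loop. `saveAll` is a hypothetical name, and `firestore` is assumed to be a server-side Firestore instance that provides `bulkWriter()`, as in the snippet above.

```javascript
// Hypothetical helper: queues one create() per document, then waits for all writes.
// `firestore` is assumed to be a server-side Firestore instance with bulkWriter().
async function saveAll(firestore, collectionName, documents) {
  const bulkWriter = firestore.bulkWriter();
  let failed = 0;
  for (const document of documents) {
    bulkWriter
      .create(firestore.collection(collectionName).doc(), document)
      // Count and log a failed write so it does not become an unhandled rejection
      .catch(err => {
        failed += 1;
        console.error('Write failed:', err);
      });
  }
  // flush() resolves once every write queued so far has completed
  await bulkWriter.flush();
  return { attempted: documents.length, failed };
}
```

Called as `await saveAll(firestore, 'collectionName', documents)`, it returns a small summary object, which is an addition for this sketch and not something the original snippet does.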