
I need to process a file in Node to extract some text from it. My issue is that, since I am new to Node, I don't understand how to pass the path to the file. I am using Cloud Functions for Firebase, so there is no server and hence no directory for files. Is there a workaround, such as using URL links instead?

Here's my Node.js code:

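const functions = require('firebase-functions')
const firebase = require('firebase-admin') // assumption: `firebase` is the Admin SDK
const pdfParse = require('pdf-parse') // assumption: pdfParse comes from the pdf-parse package

firebase.initializeApp()

exports.extractTextFromPDF = functions
  .https.onCall((data, context) => {
    const bucket = firebase.storage().bucket()
    const file = bucket.file(data.pathLink) // data.pathLink is 'my-pdf.pdf', which is a file inside my storage
    return file.download()
      .then(([buffer]) => pdfParse(buffer)) // download() resolves to an array holding the file contents
      .then(parsed => {
        // Return the inner chain so the caller receives the text once the file is deleted
        return file.delete().then(() => parsed.text)
      })
      .catch(err => {
        console.log(err)
        // Rethrow so the client sees an error instead of an undefined result
        throw new functions.https.HttpsError('internal', 'Could not extract text from the PDF')
      })
  })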

I understand I can just pass the path to a file on my server, but I have no server! Can I use a URL link instead?

If that's not possible, is it possible instead to upload a file on the front end and pass that file to Node?

I've tried a number of things:

  1. I've tried passing a URL link instead of the path to the file - doesn't work

  2. I've tried passing the Firebase Storage bucket path as the path to the file - doesn't work

  3. I've tried uploading a file from the front end and passing it to Node as the file path - doesn't work either

Renaud Tarnec

1 Answer


As shown in this official Cloud Functions sample, you can use the temporary directory of the Cloud Function as local disk storage.

As explained in the doc, you should delete temporary files before ending the Cloud Function:

Local disk storage in the temporary directory is an in-memory filesystem. Files that you write consume memory available to your function, and sometimes persist between invocations. Failing to explicitly delete these files may eventually lead to an out-of-memory error and a subsequent cold start.


The doc also draws attention to the fact that we should use platform/OS-independent methods to construct file paths. The approach presented in the sample already follows this recommendation.
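As a rough illustration of the temporary-directory approach, here is a minimal sketch; it is not the official sample itself. The helper name `processViaTempFile` and its `handler` callback are hypothetical. The `file` argument is assumed to be a Cloud Storage `File` object (for example `admin.storage().bucket().file('x/x.pdf')`), of which only `name` and `download({ destination })` are used.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: download `file` into the temporary directory,
// hand the local path to `handler`, then clean up.
// `file` is assumed to be a Cloud Storage File object.
async function processViaTempFile(file, handler) {
  // os.tmpdir() and path.join() keep the path platform/OS-independent.
  const tempPath = path.join(os.tmpdir(), path.basename(file.name));
  try {
    await file.download({ destination: tempPath });
    return await handler(tempPath);
  } finally {
    // Temporary files consume the function's memory, so always delete them.
    if (fs.existsSync(tempPath)) {
      fs.unlinkSync(tempPath);
    }
  }
}
```

Inside a callable function, this could be used as `return processViaTempFile(file, localPath => /* parse the file at localPath */)`. The `finally` block deletes the temporary file even when the download or the parsing fails.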

Renaud Tarnec
  • That seems like a good lead, but is the alternative of using an uploaded file possible here? – Scorpion Edge Jan 05 '23 at 17:43
  • One alternative would be to use Cloud Storage. You upload your file to Cloud Storage and then call the Cloud Function by passing the file path. Then in the Cloud Function you read the file from Cloud Storage. This way you avoid the complexity of uploading a file via a Cloud Function as well as avoid the limit of 10MB for the HTTP request size. – Renaud Tarnec Jan 05 '23 at 17:58
  • Yes, I've tried doing precisely that, but where do I get the file path in Firebase Storage? I tried the gs:// prefix (storage location URL), for example: gs://xxx-xxxx.appspot.com/x/x.pdf. But no luck. – Scorpion Edge Jan 05 '23 at 18:02
  • Since you upload the file, you know its file path. In the Cloud Function you define the bucket (normally the default bucket, i.e. `const bucket = admin.storage().bucket();`) and then the file (`const file = bucket.file('x/x.pdf');`). Then you use one of the methods of the `File` object from the Node.js Cloud Storage API: https://googleapis.dev/nodejs/storage/latest/File.html – Renaud Tarnec Jan 05 '23 at 19:39
  • I tried exactly that with about 5 different variations, but I persistently receive the error "ApiError: No such object: my-app-8de22.appspot.com/x.pdf", even though the file clearly is there in my Firebase Storage. – Scorpion Edge Jan 05 '23 at 21:20
  • Could you share the code of your Cloud Function by editing your question? Without the code it's nearly impossible to help you. – Renaud Tarnec Jan 05 '23 at 21:22
  • Sure, just edited. – Scorpion Edge Jan 05 '23 at 21:27
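
To illustrate the point made in the comments: `bucket.file()` expects the object path inside the bucket (such as 'x/x.pdf'), not a full gs:// URL. Below is a minimal sketch of a helper that splits either form into its parts. The function name `parseStoragePath` and the shape of its return value are hypothetical and not part of any Firebase API.

```javascript
// Hypothetical helper: accepts either "gs://bucket-name/path/to/file.pdf"
// or a plain object path such as "path/to/file.pdf".
function parseStoragePath(input) {
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(input);
  if (match) {
    return { bucket: match[1], objectPath: match[2] };
  }
  // No gs:// prefix: treat the input as an object path in the default bucket
  // and drop any leading slash, since object names do not start with one.
  return { bucket: null, objectPath: input.replace(/^\/+/, '') };
}
```

In the Cloud Function, the result could then be used as `admin.storage().bucket(bucket).file(objectPath)` when `bucket` is set, or as `admin.storage().bucket().file(objectPath)` for the default bucket.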