When I use axios to download a PDF file to the client side, it always ends up corrupted. I have a MERN stack web app that lets a user click a button to download a PDF file. The button click sends an axios GET request to the Node.js web server, where Express handles the routes and requests. The Express route uses a Google API to download a Google Doc as a PDF to the Node.js server, and I would then like Node.js to send that PDF to the client's browser. Everything works, except that the download is always corrupted on the client side. (The PDF downloads to the Node.js server and opens perfectly there.)

Here is the axios GET request that is sent to Node.js when the user clicks the button. (I am using react-redux for the dispatch function.) I have also tried setting the responseType to 'blob', and that does not work either.

export const downloadDocument = (googleDocId) => (dispatch) => {
  dispatch({
    type: DOCUMENT_DOWNLOADING
  })
  axios.get('/api/documents/download?googleDocId=' + googleDocId, {
      responseType: 'arraybuffer'
    })
    .then(res => {
      dispatch({
        type: DOCUMENT_DOWNLOADED,
        payload: res.data
      })
      const url = window.URL.createObjectURL(new Blob([res.data]), {
        type: 'application/pdf'
      })
      const link = document.createElement('a')
      link.href = url;
      link.setAttribute('download', 'LOR.pdf')
      document.body.appendChild(link)
      link.click()
    })
    .catch(err => {
      dispatch(returnErrors(err.response.data, err.response.status))
      dispatch({
        type: DOCUMENT_DOWNLOAD_FAIL
      })
    })
}

Here is the backend express.js code that is handling the download request:

router.get('/download', async (req, res) => {
  const googleDocId = req.query.googleDocId
  if (!googleDocId) {
    return res.status(400).json({
      msg: "googleDocId not specified"
    })
  }

  //Use the Google API to download a PDF from Google Drive
  const { google } = require("googleapis")
  const fs = require('fs')

  const auth = new google.auth.GoogleAuth({
    keyFile: "credentials.json",
    scopes: ["https://www.googleapis.com/auth/drive"]
  })

  //Create client instance for auth
  const client = await auth.getClient()

  // Instance of Google Drive API
  const gDrive = google.drive({
    version: "v3",
    auth: client
  })

  //Download file as PDF from google Drive to node.js server
  await gDrive.files.export({
    fileId: googleDocId,
    mimeType: 'application/pdf'
  }, {
    responseType: 'stream'
  })
    .then(response => {
      return new Promise((resolve, reject) => {
        const filePath = path.join(__dirname, "../../tmp")
        console.log(`writing to ${filePath}`);
        const dest = fs.createWriteStream(filePath);
        let progress = 0

        response.data
          .on('end', () => {
            console.log('Done downloading file.')
            resolve(filePath)
          })
          .on('error', err => {
            console.error('Error downloading file.')
            reject(err)
          })
          .on('data', d => {
            progress += d.length;
            if (process.stdout.isTTY) {
              process.stdout.clearLine()
              process.stdout.cursorTo(0)
              process.stdout.write(`Downloaded ${progress} bytes`)
            }
          })
          .pipe(dest)
      })
    })
    .catch(err => console.log(err))

  //Download file to client browser
  const options = {
    root: path.join(__dirname, "../../tmp"),
    headers: {
      'Content-Type': 'application/pdf'
    }
  }
  var filename = "LOR.pdf"
  res.sendFile(filename, options)
  //res.download('./tmp/LOR.pdf')
})

What could be going wrong? At some point after switching to responseType: 'arraybuffer', as recommended in other axios threads online, the download did work once and produced a working PDF. However, the effect is inconsistent, and the download almost always results in a corrupted PDF in the client's browser. Someone please help, I am going insane. I have tried the download.js npm package, but that doesn't give me the desired result. I have also tried the recommendations from these threads: https://gist.github.com/javilobo8/097c30a233786be52070986d8cdb1743 https://github.com/axios/axios/issues/448 (the solution in the second link worked once, and then later kept resulting in corrupted PDF downloads).

  • Taking a guess but I don't think you want to use `res.download()`. Try [res.sendFile()](https://expressjs.com/en/api.html#res.sendFile) instead – Phil Nov 02 '21 at 22:25
  • Changing it to `res.sendFile` gives this error: `throw new TypeError('path must be absolute or specify root to res.sendFile'); TypeError: path must be absolute or specify root to res.sendFile` – Alejandro Zapien Nov 02 '21 at 22:28
  • It's not a drop-in replacement for `res.download()`. Did you actually look at the examples in the documentation? – Phil Nov 02 '21 at 22:28
  • Yes I looked at some examples and it looked like it just needed the same path, are there some headers I have to set? – Alejandro Zapien Nov 02 '21 at 22:31
  • After looking at some docs, am I doing this correctly? The file is still downloading corrupted: `const options = { root: './tmp', headers: { 'Content-Type' : 'application/pdf' } }; var filename = "LOR.pdf"; res.sendFile(filename, options)` – Alejandro Zapien Nov 02 '21 at 22:38
  • You don't appear to be understanding the examples in the documentation. You need to use absolute paths. Try `root: path.join(__dirname, "tmp")` – Phil Nov 02 '21 at 22:40
  • I tried doing that. The tmp directory is in the root directory of my project. It is working and downloading a file, but the PDF is still corrupted and cannot be opened. It downloads fine to Node.js, then is sent corrupted to the client's browser – Alejandro Zapien Nov 02 '21 at 22:48
  • Please update the code in your question – Phil Nov 02 '21 at 22:53
  • updated, I know the root field name is a little hacky, I just want to solve this and I will clean up later lol – Alejandro Zapien Nov 02 '21 at 23:01
  • So I'm noticing that the same file returns a different file size on every download. Could that be a cause of the issue? If so, how can I ensure the whole file is downloaded? – Alejandro Zapien Nov 03 '21 at 00:59
  • Personally, I'd just build up the file data in memory and deliver that, avoiding writing to a temporary file entirely – Phil Nov 03 '21 at 01:01
  • now an even stranger issue, when I turn off the internet connection on my local machine, the download works and properly returns a working PDF! What gives? lol – Alejandro Zapien Nov 03 '21 at 01:15
  • I'm new to web development, how would you recommend I go about that? Ideally, I wouldn't save the PDF to the Node.js server at all, just fetch it from the Google API and send it straight to the client. How would I go about that? – Alejandro Zapien Nov 03 '21 at 01:19
  • Oh, are you wiping out any previous versions of the temp file between attempts? – Phil Nov 03 '21 at 01:25
  • Check out [How do I stream response in express?](https://stackoverflow.com/q/38788721/283366). You can probably directly stream the GDrive response stream to your express response – Phil Nov 03 '21 at 01:34
  • Yeah, earlier I was deleting that tmp file and I found that’s what was allowing it to download properly, however that only worked the first couple of times. And now it’s downloading corrupted regardless of deleting it or saving it under a different name. I will check out that link you sent on directing the file stream directly to the client. – Alejandro Zapien Nov 03 '21 at 02:17
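
On the varying file sizes mentioned in the comments above: one possible explanation (an assumption, not something confirmed in this thread) is that the route's promise resolves on the source stream's `end` event, which can fire before the write stream has flushed everything to disk, so `res.sendFile` may read a partially written file. Below is a minimal sketch of waiting for the write side instead, using `pipeline` from Node's `stream/promises`. The `saveStream` helper, the in-memory chunks, and the temp file name are hypothetical stand-ins, not the asker's code:

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const { Readable } = require('stream')
const { pipeline } = require('stream/promises')

// Hypothetical helper: writes `source` to `filePath` and resolves only after
// the write stream has finished, because pipeline() waits for the destination.
async function saveStream(source, filePath) {
  await pipeline(source, fs.createWriteStream(filePath))
  return filePath
}

// Stand-in for the Drive export stream: three in-memory chunks.
const chunks = [
  Buffer.from('%PDF-1.4\n'),
  Buffer.alloc(1024, 0x41),
  Buffer.from('\n%%EOF\n')
]
const expectedSize = chunks.reduce((total, chunk) => total + chunk.length, 0)
const target = path.join(os.tmpdir(), `lor-${process.pid}.pdf`)

// Resolves with the size on disk once the file is completely written.
const saved = saveStream(Readable.from(chunks), target)
  .then(filePath => fs.statSync(filePath).size)
```

In the route, `response.data` from `gDrive.files.export` would take the place of `Readable.from(chunks)`, and `res.sendFile` would be called only after `saveStream` resolves.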

2 Answers

I got it working exactly as I intended! Thank you, Phil, for your guidance. The Google API streams the PDF, I connect a pipe to the Express res object, and that streams the PDF directly into the response, so I can rebuild it as a blob on the other side. It now works without writing the PDF to the Node.js server at all!

//Stream file as PDF from google Drive and pipe it to express res 
gDrive.files.export({fileId: googleDocId, mimeType: 'application/pdf'}, {responseType: 'stream'})
    .then(response => {
        return new Promise((resolve, reject) => {
            console.log('Beginning download stream')
            let progress = 0
            response.data
                .on('end', () => {
                    console.log('Done downloading file.')
                    resolve()
                })
                .on('error', err => {
                    console.error('Error downloading file.')
                    reject(err)
                })
                .on('data', d => {
                    progress += d.length;
                    if (process.stdout.isTTY) {
                        process.stdout.clearLine()
                        process.stdout.cursorTo(0)
                        process.stdout.write(`Downloaded ${progress} bytes`)
                    }
                })
                .pipe(res) //Pipe data to the express res object
        })
    })
    .catch(err => console.log(err))
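
A variation on the snippet above, sketched with `pipeline` from Node's `stream/promises` so that an error on either stream rejects the promise, and with the `Content-Type` header set before any bytes are sent. `sendPdfStream` and `fakeResponse` are hypothetical names; the fake response only stands in for the Express `res` so the sketch runs on its own:

```javascript
const { Readable, Writable } = require('stream')
const { pipeline } = require('stream/promises')

// Hypothetical helper: sets the PDF content type, then pipes `source` into
// `res`. pipeline() rejects if either stream errors and cleans up both.
async function sendPdfStream(source, res) {
  res.setHeader('Content-Type', 'application/pdf')
  await pipeline(source, res)
}

// Stand-in for an Express response: records headers and collects the body.
function fakeResponse() {
  const received = []
  const res = new Writable({
    write(chunk, _encoding, callback) {
      received.push(chunk)
      callback()
    }
  })
  res.headers = {}
  res.setHeader = (name, value) => { res.headers[name] = value }
  res.body = () => Buffer.concat(received)
  return res
}

// Stand-in for the stream returned by the Drive export call.
const pdfBytes = Buffer.from('%PDF-1.4\n%%EOF\n')
const fakeRes = fakeResponse()
const done = sendPdfStream(Readable.from([pdfBytes]), fakeRes).then(() => fakeRes)
```

Inside the route this would look something like `await sendPdfStream(response.data, res)`, where `response` is the result of awaiting the `gDrive.files.export` call with `responseType: 'stream'`, assuming the same `gDrive` client as in the question.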

try this, it should work

// Your code 
const url =  window.URL.createObjectURL(new File([new Blob([res.data])], "LOR.pdf", {type: 'application/pdf'}))
// Your code
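
For reference, the `type` option belongs in the second argument of the `Blob` constructor; `createObjectURL` takes only the blob. Below is a minimal sketch, assuming `res.data` is the `ArrayBuffer` that axios returns with `responseType: 'arraybuffer'` (`toPdfBlob` is a hypothetical helper name). This only affects the MIME type the browser sees, not the bytes themselves, so it would not by itself repair a file that arrives corrupted:

```javascript
// Hypothetical helper: wraps the raw bytes of a response in a PDF-typed Blob.
function toPdfBlob(data) {
  return new Blob([data], { type: 'application/pdf' })
}

// Stand-in for res.data: the four bytes "%PDF" as an ArrayBuffer.
const bytes = new Uint8Array([0x25, 0x50, 0x44, 0x46]).buffer
const blob = toPdfBlob(bytes)

// In the browser, the download link would then be built from this blob:
// const url = window.URL.createObjectURL(blob)
```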
Hachour Fouad
  • I just tried this out, file is still corrupted on download :( – Alejandro Zapien Nov 02 '21 at 22:15
  • OK, just change the response type in axios: `axios.get("URL", { responseType: 'blob' })`. Then use the code above; it should work this time. That is all I have, thanks – Hachour Fouad Nov 02 '21 at 22:46
  • I tried this with a combination of Phil's solution and the PDF is still ending up corrupted once it is downloaded to the client's browser – Alejandro Zapien Nov 02 '21 at 22:50
  • Is the file downloadable from the browser? I mean, when you enter the link in the browser's address bar – Hachour Fouad Nov 02 '21 at 22:55
  • no, entering the link + the google doc ID does not do anything. – Alejandro Zapien Nov 02 '21 at 23:07