14

Can someone please explain why decoding Base64 gives me a broken PDF? I need to find a way to decode Base64 and get a PDF out. When I use this service

https://emn178.github.io/online-tools/base64_decode_file.html

I am able to pass in the Base64 and get the file out without a problem.

But when I do the same in Node.js, I consistently get an empty (broken) file. I tried different packages, like js-base64 and atob, and none of them worked; each produced the same empty file.

Link to my code: https://repl.it/@afiliptsov/FaroffGloriousFormula

Anton

3 Answers

26

You get a corrupted PDF, because:

  1. According to the official documentation, Base64.decode() decodes a Base64 value into a UTF-8 string. This is the wrong function here, because you need to decode the value as binary data.
  2. Base64.atob() does exactly what you need, but the mistake is in how you save the data: according to the official documentation, fs.writeFile() encodes string data as UTF-8 by default, whereas you want to write binary data.
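
Both failure modes in the list above can be reproduced with Node's built-in Buffer alone. The sketch below is only an illustration: the five sample bytes are an assumption (a "%" followed by four bytes above 0x7F, similar to the binary marker near the top of many PDFs), and Buffer's 'utf8' and 'latin1' conversions stand in for Base64.decode() and Base64.atob() rather than calling js-base64 itself.

```javascript
// Sample bytes (an assumption for illustration): "%" plus four bytes above 0x7F.
const original = Buffer.from([0x25, 0xe2, 0xe3, 0xcf, 0xd3]);
const base64 = original.toString('base64');

// Case 1: decode into a UTF-8 string. Byte sequences that are not valid UTF-8
// become replacement characters, so the original bytes cannot be recovered.
const utf8String = Buffer.from(base64, 'base64').toString('utf8');
const viaUtf8String = Buffer.from(utf8String, 'utf8');

// Case 2: decode into a "binary string" (one character per byte), then encode
// it as UTF-8, which is what fs.writeFile() does with a string by default.
const binaryString = Buffer.from(base64, 'base64').toString('latin1');
const writtenAsUtf8 = Buffer.from(binaryString, 'utf8');

// Correct: encode the same binary string as 'latin1' (its alias is 'binary').
const writtenAsBinary = Buffer.from(binaryString, 'latin1');

console.log(viaUtf8String.equals(original));   // false
console.log(writtenAsUtf8.equals(original));   // false
console.log(writtenAsBinary.equals(original)); // true
```

In both failing cases the output has more bytes than the input, which is why a PDF viewer treats the resulting file as broken.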

To properly decode a Base64 value and store it as binary data, you can choose one of the following methods, depending on your needs:

require('js-base64').Base64.atob()

Decode the Base64 value using Base64.atob() and specify the binary encoding when saving the file. This is useful only if you need to work with the binary data. Unlike the other methods, it requires you to install and load the "js-base64" module.

var bin = Base64.atob(stringToDecode);
// Your code to handle binary data
fs.writeFile('result_binary.pdf', bin, 'binary', error => {
    if (error) {
        throw error;
    } else {
        console.log('binary saved!');
    }
});

Buffer.from

Convert the Base64 value to a buffer using Buffer.from() and save it to a file without specifying an encoding. This is useful only if you need to work with the buffer.

var buf = Buffer.from(stringToDecode, 'base64');
// Your code to handle buffer
fs.writeFile('result_buffer.pdf', buf, error => {
    if (error) {
        throw error;
    } else {
        console.log('buffer saved!');
    }
});

The encoding option

If you do not need to read or modify the binary data or the buffer, just specify the encoding option when saving the file. This method is the simplest one and may be the fastest and most memory-efficient.

fs.writeFile('result_base64.pdf', stringToDecode, 'base64', error => {
    if (error) {
        throw error;
    } else {
        console.log('base64 saved!');
    }
});
Victor
  • Do you know if there is a specific format for .docx? It works fine for PDF, but when I do it with a .docx, a message pops up offering to restore the file when I try to open it. – Anton Jul 11 '19 at 19:46
  • @Anton Base64 does not care what kind of data it decodes. Errors can occur in only two cases: 1) your Base64 string is damaged, or 2) you make a mistake when storing the decoded result. Since your PDF files are saved properly, you should check the first scenario: make sure that your Base64 string is not damaged and that it matches the contents of your file. For example, [convert the docx file to Base64](https://base64.guru/converter/encode/file) and compare the result with your string. – Victor Jul 12 '19 at 17:24
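
The check suggested in the comment above can also be done locally in Node. This is a minimal sketch, and the helper name base64MatchesFile is hypothetical: it decodes the string and compares the resulting bytes with the file's bytes, which avoids false mismatches caused by line breaks inside the Base64 text (Buffer.from() ignores whitespace when decoding Base64).

```javascript
const fs = require('fs');

// Hypothetical helper: returns true when the Base64 string decodes to exactly
// the bytes stored in the file at filePath.
function base64MatchesFile(base64String, filePath) {
    const decoded = Buffer.from(base64String, 'base64');
    const fileBytes = fs.readFileSync(filePath);
    return decoded.equals(fileBytes);
}

// Example call (the path is a placeholder for your original document):
// base64MatchesFile(stringToDecode, 'original.docx');
```

If this returns false for the original .docx, the Base64 string itself was probably damaged or truncated before it reached the decoding step.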
5

A related issue for me, which was solved by reading @victor's answer, is where an Express.js app gets a Base64-encoded PDF from an API and wants to return it to the client as a 'proper' PDF:

res.set({
    'Content-Disposition' : 'attachment; filename='+ data.fileName,
    'Content-Type': 'application/pdf',
});
res.send(Buffer.from(data.content, 'base64'));
bknights
3

Simple is best! Just use the fs package to save the Base64 string to a file; remember that you have to set the encoding option to base64.

fs.writeFile('result_document.pdf', stringToDecode, 'base64', (error) => {
  if (error) throw error;
  console.log("Doc saved!");
});
hoangdv