Decode Base64 encoded PDF content in browser

Question

We transform HTML to PDF in the backend (PHP) using dompdf. The generated output from dompdf is Base64 encoded with

$output = $dompdf->output();
$encoded = base64_encode($output);

This Base64 encoded content is saved as a file on the server. When we decode this file content like this:

cat /tmp/55acbaa9600f4 | base64 -D > test.pdf

we get a proper PDF file.

But when we transfer the Base64 content to the client as a string value inside a JSON object (the server provides a RESTful API...):

{
  "file_data": "...the base64 string..."
}

and then decode it with atob() and create a Blob object to download the file later on, the PDF is always "empty"/broken.

$scope.downloadFileData = function(doc) {
  DocumentService.getFileData(doc).then(function(data) {
    var decodedFileData = atob(data.file_data);
    var file = new Blob([decodedFileData], { type: doc.file_type });
    saveAs(file, doc.title + '.' + doc.extension);
  });
};

When we log the decoded content, it seems that the content is "broken", because several symbols are not the same as when we decode the content on the server using base64 -D.

Encoding and decoding the content of simple text/plain documents works as expected, but binary (non-ASCII) formats do not work.

We have searched the web for many hours, but didn't find a solution for this that works for us. Does anyone have the same problem and can provide us with a working solution? Thanks in advance!

This is an example of the Base64 content of a PDF document, as encoded on the server:

JVBERi0xLjMKMSAwIG9iago8PCAvVHlwZSAvQ2F0YWxvZwovT3V0bGluZXMgMiAwIFIKL1BhZ2VzIDMgMCBSID4+CmVuZG9iagoyIDAgb2JqCjw8IC9UeXBlIC9PdXRsaW5lcyAvQ291bnQgMCA+PgplbmRvYmoKMyAwIG9iago8PCAvVHlwZSAvUGFnZXMKL0tpZHMgWzYgMCBSCl0KL0NvdW50IDEKL1Jlc291cmNlcyA8PAovUHJvY1NldCA0IDAgUgovRm9udCA8PCAKL0YxIDggMCBSCj4+Cj4+Ci9NZWRpYUJveCBbMC4wMDAgMC4wMDAgNjEyLjAwMCA3OTIuMDAwXQogPj4KZW5kb2JqCjQgMCBvYmoKWy9QREYgL1RleHQgXQplbmRvYmoKNSAwIG9iago8PAovQ3JlYXRvciAoRE9NUERGKQovQ3JlYXRpb25EYXRlIChEOjIwMTUwNzIwMTMzMzIzKzAyJzAwJykKL01vZERhdGUgKEQ6MjAxNTA3MjAxMzMzMjMrMDInMDAnKQo+PgplbmRvYmoKNiAwIG9iago8PCAvVHlwZSAvUGFnZQovUGFyZW50IDMgMCBSCi9Db250ZW50cyA3IDAgUgo+PgplbmRvYmoKNyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZQovTGVuZ3RoIDY2ID4+CnN0cmVhbQp4nOMy0DMwMFBAJovSuZxCFIxN9AwMzRTMDS31DCxNFUJSFPTdDBWMgKIKIWkKCtEaIanFJZqxCiFeCq4hAO4PD0MKZW5kc3RyZWFtCmVuZG9iago4IDAgb2JqCjw8IC9UeXBlIC9Gb250Ci9TdWJ0eXBlIC9UeXBlMQovTmFtZSAvRjEKL0Jhc2VGb250IC9UaW1lcy1Cb2xkCi9FbmNvZGluZyAvV2luQW5zaUVuY29kaW5nCj4+CmVuZG9iagp4cmVmCjAgOQowMDAwMDAwMDAwIDY1NTM1IGYgCjAwMDAwMDAwMDggMDAwMDAgbiAKMDAwMDAwMDA3MyAwMDAwMCBuIAowMDAwMDAwMTE5IDAwMDAwIG4gCjAwMDAwMDAyNzMgMDAwMDAgbiAKMDAwMDAwMDMwMiAwMDAwMCBuIAowMDAwMDAwNDE2IDAwMDAwIG4gCjAwMDAwMDA0NzkgMDAwMDAgbiAKMDAwMDAwMDYxNiAwMDAwMCBuIAp0cmFpbGVyCjw8Ci9TaXplIDkKL1Jvb3QgMSAwIFIKL0luZm8gNSAwIFIKPj4Kc3RhcnR4cmVmCjcyNQolJUVPRgo=

If you atob() this, you don't get the same result as on the console with base64 -D. Why?

Did you find a workaround? I have the same problem, even with external libraries... – tvelop, Oct 20 '15 at 21:55
We still use the browser's btoa() to encode to Base64. We store the content in this format on the server, and the server itself decodes it and streams it directly to the browser. We were not able to send it back as JSON through our RESTful API and let the browser decode it. Only text formats (e.g. MIME type text/plain) worked; PDF and other non-readable formats did not. If you find a solution, please let me know, because we are not happy with our workaround. – Marcel Härle, Oct 22 '15 at 06:21

score 4 · Answer 1 · answered Jan 30 '20 at 06:24

Your issue looks identical to the one I needed to solve recently.

Here is what worked for me:

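// Decode the Base64 text into a "binary string" (one character per byte).
const binaryImg = atob(base64String);
const length = binaryImg.length;
const arrayBuffer = new ArrayBuffer(length);
const uintArray = new Uint8Array(arrayBuffer);

// Copy each character code (0-255) into the byte array unchanged.
for (let i = 0; i < length; i++) {
    uintArray[i] = binaryImg.charCodeAt(i);
}

// Build the Blob from raw bytes, not from the decoded string.
const fileBlob = new Blob([uintArray], { type: 'application/pdf' });

// saveAs() as used in the question (for example, the one provided by FileSaver.js).
saveAs(fileBlob, 'filename.pdf');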

It seems that doing only a Base64 decode is not enough: you need to put the result into a Uint8Array before creating the Blob. Otherwise, the PDF pages appear blank.
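
A likely explanation, sketched here rather than taken from the linked issue: atob() returns a "binary string" with one character per byte, and a string handed to the Blob constructor is encoded as UTF-8, so every byte above 0x7F turns into two bytes. The snippet below uses TextEncoder to show that same UTF-8 conversion without needing a Blob. The value of `sample` is a four-byte illustration (0x78 0x9C 0xE3 0x32), not something returned by the API in the question.

```javascript
// Four sample bytes (0x78 0x9C 0xE3 0x32), Base64 encoded for illustration.
const sample = "eJzjMg==";

// atob() yields a "binary string": one UTF-16 code unit per decoded byte.
const binaryString = atob(sample);

// Encoding that string as UTF-8 (the conversion applied to string parts
// of a Blob) expands every code unit above 0x7F into two bytes.
const utf8Bytes = new TextEncoder().encode(binaryString);

console.log(binaryString.length); // 4
console.log(utf8Bytes.length);    // 6
```

Two of the four bytes are above 0x7F, so the UTF-8 form is two bytes longer. Plain ASCII text is unaffected, which would match the observation that text/plain documents survive while PDFs do not.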

I found this solution here: https://github.com/sayanee/angularjs-pdf/issues/110#issuecomment-579988190

score 1 · Answer 2 · edited May 23 '17 at 10:28

You can use btoa() and atob(), which work in some browsers. For example:

var enc = btoa("this is some text");
alert(enc);
alert(atob(enc));

JSON and Base64 are completely independent.
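
As a small check of that independence, the sketch below round-trips a Base64 string through JSON and confirms it comes back unchanged. The `file_data` key mirrors the question's payload, and the value is just the first twelve characters of the sample PDF posted there, so this only illustrates that the JSON transport itself should not corrupt the data.

```javascript
// First 12 Base64 characters of the sample PDF from the question.
const original = "JVBERi0xLjMK";

// Serialize and parse, as a REST API round trip would.
const wire = JSON.stringify({ file_data: original });
const received = JSON.parse(wire);

console.log(received.file_data === original); // true
console.log(JSON.stringify(atob(received.file_data))); // "%PDF-1.3\n"
```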

Here's a JSON stringifier/parser (and direct GitHub link).

Here's a base64 Q&A. Here's another one.

answered Jul 20 '15 at 10:37 · Bhavin Solanki

The Base64 encoding is done on the server because there is some processing involved in converting from HTML to PDF. I know that btoa() and atob() work as counterparts in JavaScript. And as explained above, everything works perfectly when strings are encoded and decoded. But it does not work for "binary"-like content, in our case the content of a PDF file. – Marcel Härle Jul 20 '15 at 11:19
I think this might help you: https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding – hugsbrugs Mar 20 '16 at 17:04
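
For binary content like this, one possible way to package the byte-level decoding is a pair of small helpers. This is a sketch under assumptions: the names `base64ToUint8Array` and `base64ToBlob` are hypothetical and not part of any library or of the linked article.

```javascript
// Hypothetical helper: decode Base64 into raw bytes, one byte per character
// of the binary string returned by atob().
function base64ToUint8Array(base64) {
  return Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
}

// Hypothetical helper: wrap the raw bytes in a Blob with the given MIME type.
// Defined here for browser use; it is not called in this snippet.
function base64ToBlob(base64, mimeType) {
  return new Blob([base64ToUint8Array(base64)], { type: mimeType });
}

// The first 12 Base64 characters of the question's sample decode to 9 bytes.
const headerBytes = base64ToUint8Array("JVBERi0xLjMK");
console.log(headerBytes.length); // 9
```

In the question's code, the result of `base64ToBlob(data.file_data, doc.file_type)` could then be passed to saveAs() in place of the Blob built from the decoded string.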
