
I'm trying to implement PDF file download functionality in JavaScript.
The response to a POST request is a PDF file; in the Chrome DevTools console it looks like this (a fragment of the oResult data container):

"%PDF-1.4↵%����↵4 0 obj↵<</Filter/FlateDecode/Length 986>>stream↵x��

Now I'm trying to initialize the download process:

let blob = new Blob([oResult], {type: "application/pdf"});

let link = document.createElement('a');

link.href = window.URL.createObjectURL(blob);
link.download = "tstPDF";

link.click();

As a result, when I click the button I get tstPDF.pdf. It contains the correct number of pages, but the PDF itself is empty: no content is displayed, although the file is 6 KB.

When I test the Java server-side module that generates the PDF, everything works fine; it sends an InputStream through a ServletOutputStream. I therefore assume the issue is somewhere on the client side, perhaps something to do with MIME, Blob, encoding, or similar.

Why doesn't the generated PDF display any data?

Mike
  • 8
    You are experiencing "byte shaving". Bytes consist of 8 bits. For ASCII, only 7 bits are needed. PDFs are binary files, they need every bit, but some processes treat PDFs as if they were text files, and they remove the highest bit. When that happens, the structure of the PDF consisting of ASCII characters is preserved (which is why it opens in a viewer), but the binary content of the pages is corrupted (which is why the pages are blank). This is not an iText problem. This is caused by a process that corrupts the bytes in the binary file. – Bruno Lowagie Aug 23 '18 at 08:48
  • 1
    The presence of � characters in the header proves that the bytes were corrupted. Those four bytes should be four different "characters" with a value higher than `01111111` (or `127` or `7F`). – Bruno Lowagie Aug 23 '18 at 08:51
  • 1
    By the way: I think you are using an old version of iText because I see that you're creating PDF 1.4 files. iText 7.0.x creates PDF 1.7 files; iText 7.1.x creates PDF 2.0 files. – Bruno Lowagie Aug 23 '18 at 08:53
  • @BrunoLowagie, thanks for the prompt response and clarification. When I consume the web service directly via `FORM action="./PdfServlet" method="post" accept-charset="UTF-8" enctype="application/x-www-form-urlencoded;charset=UTF-8"` everything is OK, but when I use it via jQuery `POST` I face the byte-shaving issue. Could you please advise how to avoid such behaviour? Thanks. P.S. Yes, I'm using iText 5.x, which is part of the legacy code. – Mike Aug 23 '18 at 09:01
  • I can answer iText-related questions; jQuery POST questions are out of my league. – Bruno Lowagie Aug 23 '18 at 09:07
  • @BrunoLowagie, thanks for the assistance. I solved my issue with the help of your post in another thread. – Mike Aug 24 '18 at 07:59
  • This is the first SO post that appears when looking for that PDF data format. Following the lead of @BrunoLowagie's comment, I [found this SO answer](https://stackoverflow.com/questions/60454048/how-does-axios-handle-blob-vs-arraybuffer-as-responsetype/60461828#60461828), which beautifully explains that you may be "expecting" the wrong, default, response type. Any attempt at formatting, encoding, or converting this data won't help, as important data has already been lost due to the erroneous response type, which may be `JSON` or a text string. What we need here is either an `ArrayBuffer` or a `Stream`. – Advena Jan 09 '23 at 11:59
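
The corruption described in the comments above can be reproduced in a few lines. The sketch below is illustrative only: it assumes the response body was read as UTF-8 text, as happens when no binary response type is requested, and the four sample bytes are chosen for the demonstration rather than taken from the file in the question.

```javascript
// Four bytes above 0x7F, standing in for the binary marker on the second
// line of a PDF file (sample values, assumed for illustration).
const original = new Uint8Array([0xe2, 0xe3, 0xcf, 0xd3]);

// Reading the body as text decodes it as UTF-8. These bytes do not form a
// valid UTF-8 sequence, so each one is replaced by U+FFFD (shown as "�").
const asText = new TextDecoder("utf-8").decode(original);

// Turning the text back into bytes cannot recover the originals:
// every U+FFFD is re-encoded as the three bytes EF BF BD.
const roundTripped = new TextEncoder().encode(asText);

console.log([...asText].map((ch) => ch.codePointAt(0).toString(16)));
console.log(original.length, "bytes in,", roundTripped.length, "bytes out");
```

Once the replacement characters are in the string, wrapping it in a `Blob` only preserves the damage, which is consistent with a file that opens but shows blank pages.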

2 Answers


I solved the issue. The problem was in the way the data was delivered from the server to the client. It is critical to ensure that the server sends the data Base64-encoded; otherwise the client cannot deserialize the PDF string back into binary form. The full solution is below.

Server-side:

OutputStream pdfStream = PDFGenerator.pdfGenerate(data);

String pdfFileName = "test_pdf";

// represent PDF as byteArray for further serialization
byte[] byteArray = ((java.io.ByteArrayOutputStream) pdfStream).toByteArray();

// serialize PDF to Base64
byte[] encodedBytes = java.util.Base64.getEncoder().encode(byteArray);

response.reset();
response.addHeader("Pragma", "public");
response.addHeader("Cache-Control", "max-age=0");
response.setHeader("Content-disposition", "attachment;filename=" + pdfFileName);
response.setContentType("application/pdf");

// declare the exact length of the transferred (Base64-encoded) data
response.setContentLength(encodedBytes.length);

// send to output stream
ServletOutputStream servletOutputStream = response.getOutputStream();

servletOutputStream.write(encodedBytes);
servletOutputStream.flush();
servletOutputStream.close();

Client side:

// data: the Base64 string received in the response body
let binaryString = window.atob(data);

let binaryLen = binaryString.length;

let bytes = new Uint8Array(binaryLen);

// copy each character code (0-255) into the byte array
for (let i = 0; i < binaryLen; i++) {
    bytes[i] = binaryString.charCodeAt(i);
}

let blob = new Blob([bytes], {type: "application/pdf"});

let link = document.createElement('a');

link.href = window.URL.createObjectURL(blob);
// pdfFileName: the name to save the file as (must be defined on the client)
link.download = pdfFileName;

link.click();
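
As comments on this page suggest, an alternative is to skip Base64 and ask the browser for the response as binary data. The following is a minimal sketch, assuming the server writes the raw PDF bytes to the response; `downloadPdf`, `looksLikePdf`, and their parameters are hypothetical names introduced for illustration.

```javascript
// Returns true when the bytes start with the "%PDF-" signature.
function looksLikePdf(bytes) {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  return signature.every((value, i) => bytes[i] === value);
}

// Browser-only: POSTs to `url` and saves the binary response as `fileName`.
async function downloadPdf(url, body, fileName) {
  const response = await fetch(url, { method: "POST", body });
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }

  // arrayBuffer() hands back the bytes untouched; no text decoding happens.
  const bytes = new Uint8Array(await response.arrayBuffer());
  if (!looksLikePdf(bytes)) {
    throw new Error("Response does not start with %PDF-");
  }

  const blob = new Blob([bytes], { type: "application/pdf" });
  const objectUrl = URL.createObjectURL(blob);

  const link = document.createElement("a");
  link.href = objectUrl;
  link.download = fileName;
  link.click();

  // Release the object URL a moment after the download has been triggered.
  setTimeout(() => URL.revokeObjectURL(objectUrl), 1000);
}
```

The signature check is optional; it is there because a response that was corrupted or Base64-encoded would fail it, which makes the mismatch visible before a broken file is saved.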


Mike
  • 2
    Great answer! This will be useful for further reference in case I see a question of someone who experiences the same problem. – Bruno Lowagie Aug 24 '18 at 09:52
  • 3
    Although it works, I would not use this when PDF files can get big. It is problematic on the server, where toByteArray makes a copy of the data in memory and then encode makes another copy, ~1.4 times bigger. The client side is also bad: you hold the data once, then binaryString, then bytes, then the blob. Maybe you need this: https://stackoverflow.com/a/16711825 – Marius Ologesa May 22 '20 at 12:00
  • @MariusOlogesa, the proposed link contains only client-side part, what about server side? Is there any better way to avoid _«`toByteArray` makes a copy of data in memory, then `encode` makes another copy, ~1.4 times bigger»_? – Mike May 22 '20 at 13:48
  • 1
    After trying a lot of solutions, this one works for me. Thanks a lot! – ASK Aug 15 '20 at 15:08
  • DOMException: Failed to execute 'atob' on 'Window': The string to be decoded contains characters outside of the Latin1 range – Sveen Jan 02 '23 at 15:19
  • 1
    also atob is deprecated – Sveen Jan 02 '23 at 15:19
  • @Sveen, according to https://stackoverflow.com/a/71330832, _«`btoa` and `atob` are only deprecated for Node.js»_; on the **client side** everything should still be **valid**. On https://developer.mozilla.org/en-US/docs/Web/API/atob there is also no mention of deprecation. How did you get _«`DOMException`: Failed to execute `atob` on `Window`»?_ Please provide a reproduction code sample on https://jsbin.com – Mike Jan 03 '23 at 07:54
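
The `DOMException` quoted above is what `atob` reports when its input contains characters with code points above 255, which suggests the string being decoded was not Base64 at all (for example, raw PDF bytes read as text). A small guard can make that failure easier to diagnose. This is a sketch only: `base64ToBytes` is a hypothetical helper, and the check assumes the standard Base64 alphabet with optional padding.

```javascript
// Decodes a Base64 string into bytes, failing early with a clearer
// message when the input is not Base64 (hypothetical helper).
function base64ToBytes(base64) {
  // Line breaks and spaces are harmless in transport, so drop them first.
  const cleaned = base64.replace(/\s+/g, "");

  // Standard alphabet, with up to two "=" padding characters at the end.
  if (!/^[A-Za-z0-9+/]*={0,2}$/.test(cleaned)) {
    throw new TypeError("Input is not Base64; was the PDF sent as raw bytes?");
  }

  const binaryString = atob(cleaned);
  // Each character of binaryString holds one byte value (0 to 255).
  return Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
}

// "JVBERi0xLjQ=" is the Base64 form of the header "%PDF-1.4".
const header = base64ToBytes("JVBERi0xLjQ=");
console.log(String.fromCharCode(...header)); // %PDF-1.4
```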

Thanks for this; it really works.

BTW, here's how I do it using a Spring controller and Ajax, with a PDF generated by Jasper.

The Controller:

public ResponseEntity<?> printPreview(@ModelAttribute("claim") Claim claim)
{
    try
    {
        //Code to get the byte[] from jasper report.
        ReportSource source = new ReportSource(claim);
        byte[] report = reportingService.exportToByteArrayOutputStream(source);

        //Conversion of bytes to Base64
        byte[] encodedBytes = java.util.Base64.getEncoder().encode(report);

        //Setting Headers
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.parseMediaType("application/pdf"));
        headers.setContentDispositionFormData("pdfFileName.pdf", "pdfFileName.pdf");
        headers.setCacheControl("must-revalidate, post-check=0, pre-check=0");
        headers.setContentLength(encodedBytes.length);

        return new ResponseEntity<>(encodedBytes, headers, HttpStatus.OK);
    }
    catch (Exception e)
    {
        LOG.error("Error on generating report", e);
        return new ResponseEntity<>(null, HttpStatus.INTERNAL_SERVER_ERROR);
    }
}

The ajax:

$.ajax({
    type: "POST",
    url: "",
    data: form.serialize(), // Data from my form
    success: function(response) {
        let binaryString = window.atob(response);
        let binaryLen = binaryString.length;
        let bytes = new Uint8Array(binaryLen);

        for (let i = 0; i < binaryLen; i++) {
            let ascii = binaryString.charCodeAt(i);
            bytes[i] = ascii;
        }

        let blob = new Blob([bytes], {type: "application/pdf"});
        let link = URL.createObjectURL(blob);
        window.open(link, '_blank');
    },
    error: function() {

    }
});

This loads the PDF in a new window.
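
If the controller returned the raw report bytes instead of Base64, the decoding loop could be dropped by asking the underlying XMLHttpRequest for a Blob. The sketch below assumes jQuery 3 or later and a controller that writes the unencoded `byte[]`; `pdfAjaxSettings` is a hypothetical helper that only builds the settings object, so it can be inspected without a browser.

```javascript
// Builds settings for $.ajax that request the response body as a Blob
// (hypothetical helper; `url`, `formData`, and `onBlob` come from the caller).
function pdfAjaxSettings(url, formData, onBlob) {
  return {
    type: "POST",
    url: url,
    data: formData,
    // Passed through to the XMLHttpRequest, so the body is not decoded as text.
    xhrFields: { responseType: "blob" },
    success: onBlob,
  };
}

// Browser usage, assuming jQuery is loaded as `$` and `form` is the jQuery form object:
// $.ajax(pdfAjaxSettings("", form.serialize(), function (blob) {
//   window.open(URL.createObjectURL(blob), "_blank");
// }));
```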

References: Return generated pdf using spring MVC

jeloVilla