83

I'm trying to render a page from a pdf with pdf.js

Normally, using a url, I can do this:

PDFJS.getDocument("http://www.server.com/file.pdf").then(function getPdfHelloWorld(pdf) {
  //
  // Fetch the first page
  //
  pdf.getPage(1).then(function getPageHelloWorld(page) {
    var scale = 1.5;
    var viewport = page.getViewport(scale);

    //
    // Prepare canvas using PDF page dimensions
    //
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    //
    // Render PDF page into canvas context
    //
    page.render({canvasContext: context, viewport: viewport});
  });
});

But in this case, I have the file in base64 rather than an url:

data:application/pdf;base64,JVBERi0xLjUKJdDUxdgKNSAwIG9iaiA8PAovTGVuZ3RoIDE2NjUgICAgICAKL0ZpbHRlciAvRmxhdGVEZWNvZGUKPj4Kc3RyZWFtCnjarVhLc9s2...

How this can be done?

Rob W
  • 341,306
  • 83
  • 791
  • 678
Alex
  • 2,036
  • 4
  • 22
  • 31

3 Answers3

113

from the sourcecode at http://mozilla.github.com/pdf.js/build/pdf.js

/**
 * This is the main entry point for loading a PDF and interacting with it.
 * NOTE: If a URL is used to fetch the PDF data a standard XMLHttpRequest(XHR)
 * is used, which means it must follow the same origin rules that any XHR does
 * e.g. No cross domain requests without CORS.
 *
 * @param {string|TypedAray|object} source Can be an url to where a PDF is
 * located, a typed array (Uint8Array) already populated with data or
 * and parameter object with the following possible fields:
 *  - url   - The URL of the PDF.
 *  - data  - A typed array with PDF data.
 *  - httpHeaders - Basic authentication headers.
 *  - password - For decrypting password-protected PDFs.
 *
 * @return {Promise} A promise that is resolved with {PDFDocumentProxy} object.
 */

So a standard XMLHttpRequest(XHR) is used for retrieving the document. The Problem with this is that XMLHttpRequests do not support data: uris (eg. data:application/pdf;base64,JVBERi0xLjUK...).

But there is the possibility of passing a typed Javascript Array to the function. The only thing you need to do is to convert the base64 string to a Uint8Array. You can use this function found at https://gist.github.com/1032746

var BASE64_MARKER = ';base64,';

function convertDataURIToBinary(dataURI) {
  var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
  var base64 = dataURI.substring(base64Index);
  var raw = window.atob(base64);
  var rawLength = raw.length;
  var array = new Uint8Array(new ArrayBuffer(rawLength));

  for(var i = 0; i < rawLength; i++) {
    array[i] = raw.charCodeAt(i);
  }
  return array;
}

tl;dr

var pdfAsDataUri = "data:application/pdf;base64,JVBERi0xLjUK..."; // shortened
var pdfAsArray = convertDataURIToBinary(pdfAsDataUri);
PDFJS.getDocument(pdfAsArray)
Irvin Dominin
  • 30,819
  • 9
  • 77
  • 111
Codetoffel
  • 2,743
  • 2
  • 21
  • 14
  • 1
    so will it be possible to fetch the binary of pdf and show it in the pdf viewer using pdf.js – Dakait May 16 '13 at 11:11
  • 1
    Nice work. But what if the source is PDF retrieved via a RESTful call into an arraybuffer or blob? I posted a question on it here: http://stackoverflow.com/questions/24288221/pdf-js-render-pdf-using-an-arraybuffer-or-blob-instead-of-url – witttness Jun 18 '14 at 15:29
  • 2
    Don't forget the `var` before `i = 0` if you are in `strict mode`! I lost 1 hour for nothing. :) – kube Apr 14 '15 at 01:45
  • I got this to work. My answer is [here](http://stackoverflow.com/questions/24535799/pdf-js-and-viewer-js-pass-a-stream-or-blob-to-the-viewer/32634740#32634740) – toddmo Sep 17 '15 at 16:21
  • I like arrays :-) `var base64 = dataURI.split(BASE64_MARKER)[1];` – ddlab Dec 22 '17 at 15:32
5

According to the examples base64 encoding is directly supported, although I've not tested it myself. Take your base64 string (derived from a file or loaded with any other method, POST/GET, websockets etc), turn it to a binary with atob, and then parse this to getDocument on the PDFJS API likePDFJS.getDocument({data: base64PdfData}); Codetoffel answer does work just fine for me though.

Uzer
  • 3,054
  • 1
  • 17
  • 23
  • 1
    I have tested it using the nodejs package with `PDFJS.getDocument({data: Buffer.from(pdf_base64, 'base64')})` – efeder Dec 07 '18 at 17:16
0

Used the Accepted Answer to do a check for IE and convert the dataURI to UInt8Array; an accepted form by PDFJS

        Ext.isIE ? pdfAsDataUri = me.convertDataURIToBinary(pdfAsDataUri): '';

        convertDataURIToBinary: function(dataURI) {
          var BASE64_MARKER = ';base64,',
            base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length,
            base64 = dataURI.substring(base64Index),
            raw = window.atob(base64),
            rawLength = raw.length,
            array = new Uint8Array(new ArrayBuffer(rawLength));

          for (var i = 0; i < rawLength; i++) {
            array[i] = raw.charCodeAt(i);
          }
          return array;
        },
Akin Okegbile
  • 1,108
  • 19
  • 36