16

I am using pdfkit on my Node server, typically creating PDF files and then uploading them to S3.

The problem is that the pdfkit examples pipe the PDF doc into a Node write stream, which writes the file to disk. I followed the example and it worked correctly; however, my requirement now is to pipe the PDF doc into a memory stream rather than save it to disk (I am uploading to S3 anyway).

I've tried several Node memory stream approaches, but none of them seem to work when piping the PDF; I could only write plain strings to the memory streams.

So my question is: how do I pipe the pdfkit output to a memory stream (or something similar) and then read it back as an object to upload to S3?

var fsStream = fs.createWriteStream(outputPath + fileName); 
doc.pipe(fsStream);
Mahmoud Ezzat
  • Thanks to this answer from @bolav https://stackoverflow.com/a/35661202/7287324 I wrote this gist for a Node environment to generate a PDF with ChartJS charts: https://gist.github.com/ChemaCLi/006b2d0615cd617ff88900ba119189f8 I needed to handle the PDF as a temporary file. – Chemah Nov 04 '21 at 00:17

6 Answers

18

An updated answer for 2020. There is no need to introduce a new memory stream because "PDFDocument instances are readable Node streams".

You can use the get-stream package to make it easy to wait for the document to finish before passing the result back to your caller. https://www.npmjs.com/package/get-stream

const PDFDocument = require('pdfkit')
const getStream = require('get-stream')

const pdf = async () => {
  const doc = new PDFDocument()
  doc.text('Hello, World!')
  doc.end()
  return await getStream.buffer(doc)
}


// Caller could do this:
const pdfBuffer = await pdf()
const pdfBase64string = pdfBuffer.toString('base64')

You don't have to return a buffer if your needs are different. The get-stream readme offers other examples.

TroyWolf
4

There's no need to use an intermediate memory stream[1] – just pipe the pdfkit output stream directly into an HTTP upload stream.

In my experience, the AWS SDK is garbage when it comes to working with streams, so I usually use request.

var request = require('request');

var upload = request({
    method: 'PUT',
    url: 'https://bucket.s3.amazonaws.com/doc.pdf',
    aws: { bucket: 'bucket', key: ..., secret: ... }
});

doc.pipe(upload);

[1] In fact, it is usually undesirable to use a memory stream, because that means buffering the entire thing in RAM, which is exactly what streams are supposed to avoid!
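The request package has since been deprecated, but the same no-buffering approach works with the AWS SDK v3 Upload helper from @aws-sdk/lib-storage, which accepts a readable stream as the body. A minimal sketch, assuming your own bucket name and region (this is not part of the original answer):

const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const PDFDocument = require('pdfkit');

const doc = new PDFDocument();
doc.text('Hello, World!');
doc.end();

// Upload streams the body to S3 in parts, so the whole PDF
// never needs to be buffered in RAM at once.
const upload = new Upload({
    client: new S3Client({ region: 'us-east-1' }), // assumption: your region
    params: {
        Bucket: 'bucket',        // assumption: your bucket
        Key: 'doc.pdf',
        Body: doc,               // PDFDocument is a readable stream
        ContentType: 'application/pdf',
    },
});

upload.done().then(() => console.log('uploaded'));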

josh3736
  • I tried this, but I'm not sure how to properly end the pipe or the `request`, so I ended up with this error: `[grunt-develop] > events.js:141 throw er; // Unhandled 'error' event ^ Error: read ECONNRESET at exports._errnoException (util.js:856:11) at TLSWrap.onread (net.js:544:26) >> application exited with code 1` – Mahmoud Ezzat Feb 27 '16 at 00:13
  • When a readable stream (here, `doc`) ends, it will automatically tell the writable stream it is piped to (`upload`) that it has ended. If you want to know when the writable (upload) is done, listen for the [`finish` event](https://nodejs.org/api/stream.html#stream_event_finish). – josh3736 Feb 27 '16 at 00:16
  • And that stack trace indicates you're not listening for `error` events on something which is getting disconnected. Add `error` listeners so you can determine exactly what is getting disconnected. If it's your `upload` request, that's quite strange. – josh3736 Feb 27 '16 at 00:19
  • I've added an error catcher: `upload.on('error', function (e) { console.log('error', e); });` which logs `error { [Error: read ECONNRESET] code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }` – Mahmoud Ezzat Feb 27 '16 at 00:26
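Pulling the advice from these comments together, a minimal sketch of the error and finish listeners (using the doc and upload names from the answer above; not part of the original answer):

doc.pipe(upload);

// Add error listeners on both sides of the pipe so a
// disconnect like ECONNRESET can be traced to its source.
doc.on('error', function (err) {
    console.error('pdf generation error', err);
});
upload.on('error', function (err) {
    console.error('upload error', err);
});

// 'finish' fires once the writable (upload) side is done.
upload.on('finish', function () {
    console.log('upload complete');
});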
2

You could try something like this, and upload it to S3 inside the `end` event handler.

var PDFDocument = require('pdfkit');
var MemoryStream = require('memorystream');

var doc = new PDFDocument();

// Write-only memory stream: written chunks accumulate in memStream.queue
var memStream = new MemoryStream(null, {
   readable : false
});

doc.pipe(memStream);

doc.on('end', function () {
   var buffer = Buffer.concat(memStream.queue);
   awsservice.putS3Object(buffer, fileName, fileType, folder).then(function () { }, reject);
})
bolav
  • Hi bolav, thanks for your suggestion, however my problem is that I don't want to use `fs.createWriteStream` on the server. – Mahmoud Ezzat Feb 26 '16 at 20:45
  • Yes. I thought you understood that you could replace the file write, which I used just to prove the approach works, with something that uploads to S3. How do you upload your files to S3? – bolav Feb 26 '16 at 21:23
  • `awsservice.putS3Object(objectToUpload, fileName, fileType, folder).then(function () { }, reject);` – Mahmoud Ezzat Feb 26 '16 at 23:00
  • I can't include the implementation of `putS3Object` here because comments allow too few characters. – Mahmoud Ezzat Feb 26 '16 at 23:05
  • Hey thanks! It worked for me. I needed to make the PDF a temporary file and then use the file path. – Chemah Nov 04 '21 at 00:14
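The putS3Object implementation never made it into the comments, so for completeness, here is a hypothetical sketch of what such a helper might look like, assuming the v2 aws-sdk package; the bucket name and parameter handling are assumptions, not the commenter's actual code:

// Hypothetical sketch of a putS3Object helper like the one referenced above;
// not the commenter's actual implementation.
var AWS = require('aws-sdk');
var s3 = new AWS.S3();

function putS3Object(objectToUpload, fileName, fileType, folder) {
    return s3.putObject({
        Bucket: 'your-bucket',     // assumption: your bucket name
        Key: folder + '/' + fileName,
        Body: objectToUpload,      // the Buffer built from memStream.queue
        ContentType: fileType
    }).promise();
}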
0

A tweak of @bolav's answer worked for me with pdfmake rather than pdfkit. First, add memorystream to your project using npm or yarn.

const MemoryStream = require('memorystream');
const PdfPrinter = require('pdfmake');
const pdfPrinter = new PdfPrinter(); // note: pdfmake's PdfPrinter normally expects a font descriptor object
const docDef = {}; // your pdfmake document definition goes here
const pdfDoc = pdfPrinter.createPdfKitDocument(docDef);
const memStream = new MemoryStream(null, {readable: false});
const pdfDocStream = pdfDoc.pipe(memStream);
pdfDoc.end();
pdfDocStream.on('finish', () => {
  console.log(Buffer.concat(memStream.queue));
});
0

My code to return a base64 string from pdfkit:

import * as PDFDocument from 'pdfkit'
import getStream from 'get-stream'

const pdf = {
  createPdf: async (text: string) => {
    const doc = new PDFDocument()
    doc.fontSize(10).text(text, 50, 50)
    doc.end()

    const data = await getStream.buffer(doc)
    let b64 = Buffer.from(data).toString('base64')
    return b64
  }
}

export default pdf
Alan
0

Thanks to Troy's answer, mine worked with get-stream as well. The difference is that I did not convert it to a base64 string, but rather uploaded it to AWS S3 as a buffer.

Here is my code:

import PDFDocument from 'pdfkit'
import getStream from 'get-stream';
import { PutObjectCommand } from '@aws-sdk/client-s3';
import s3Client from 'your s3 config file';

const pdfGenerator = () => {
  const doc = new PDFDocument();
  doc.text('Hello, World!');
  doc.end();
  return doc;
}

const uploadFile = async () => {
  const pdf = pdfGenerator();
  const pdfBuffer = await getStream.buffer(pdf)

  await s3Client.send(
    new PutObjectCommand({
      Bucket: 'bucket-name',
      Key: 'filename.pdf',
      Body: pdfBuffer,
      ContentType: 'application/pdf',
    })
  );
}

uploadFile().catch(console.error)