I am using AWS Lambda to generate PDFs with the html-pdf npm package. Everything works flawlessly except for Hindi characters, which come out as unintelligible gibberish in the generated PDF.

Packages used

  1. html-pdf
  2. ejs

Things I tried:

I ran the same Node.js code on my local machine and it worked as expected, but it does not work on Lambda (Node.js v6.10/8.10).

Deepak Mallah
  • How are you outputting the PDF? Upload to S3? Return directly from Lambda? How is the Lambda invoked? Directly? Via API Gateway? Other event? – cementblocks Nov 14 '18 at 18:02
  • Is the string you are using for the file name returned from a lambda function? Try parsing it as JSON e.g. `name = JSON.parse(filename)` before using it. – bwest Nov 14 '18 at 18:10
  • @cementblocks I am creating a stream using ejs and uploading the stream directly to S3 – Deepak Mallah Nov 15 '18 at 09:53
  • @DeepakMallah square blocks typically appear when you use a *font* that doesn't include the character you want. – Panagiotis Kanavos Nov 20 '18 at 08:40
  • @PanagiotisKanavos I am not using any font – Deepak Mallah Nov 20 '18 at 08:42
  • @DeepakMallah you are. Otherwise you wouldn't see *any* text. If you see text, you are using a font, even if it's a default one – Panagiotis Kanavos Nov 20 '18 at 08:42
  • @DeepakMallah Another possibility is that you are *not* using Unicode text (UTF16 or UTF8) but a single-byte codepage. If you try to load that text using the *wrong* codepage, you may end up with gibberish, question marks or square blocks for any bytes that have no matching character in the new codepage – Panagiotis Kanavos Nov 20 '18 at 08:44
  • @DeepakMallah you may have ISCII text, for example. If your code doesn't specify the codepage when loading the text, the system's locale will be used. This means your machine can display the text just fine, but any cloud VM or Lambda server will try to load the text as UTF8 or Latin 1, resulting in errors – Panagiotis Kanavos Nov 20 '18 at 08:47
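
To illustrate the codepage point raised in the comments, here is a minimal Node.js sketch of what happens when bytes are decoded with the wrong encoding. The sample string and variable names are only illustrative; this is not the asker's code, and it does not establish that encoding (rather than a missing font) is the cause here.

```javascript
// Hindi sample text (illustrative), encoded as UTF-8 bytes.
const original = 'हिन्दी';
const utf8Bytes = Buffer.from(original, 'utf8');

// Decoding with the matching encoding round-trips the text.
const decodedCorrectly = utf8Bytes.toString('utf8');

// Decoding the same bytes as Latin-1 maps every byte to its own
// character, so the Devanagari text turns into gibberish.
const decodedWrongly = utf8Bytes.toString('latin1');

console.log(decodedCorrectly === original); // true
console.log(decodedWrongly === original);   // false
```

If the text reaches the template with the same encoding it was written in, this kind of corruption does not occur, which leaves the font explanation as the other candidate.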

1 Answer

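By default, a Lambda function served through API Gateway does not return raw binary. Binary output such as a PDF comes back base64-encoded unless API Gateway is configured to convert it.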

If your PDF is served back through API Gateway, you can get binary output by changing the API Gateway settings as follows:

  1. Go to the API Gateway that corresponds to your Lambda function (mine was named Generate Calendar).
  2. Select Settings.
  3. Under Binary Media Types, enter `*/*`.
  4. Click the blue Save Changes button.

Then re-deploy the API:

  1. Click Resources.
  2. Under the Action button, select Deploy API.
  3. Under Deployment stage, select Prod.
  4. Then click the blue Deploy button.
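
On the Lambda side, a function behind a Lambda proxy integration usually has to return the PDF as a base64 string and flag it with `isBase64Encoded`, so that API Gateway can decode it back to binary once the binary media type is configured. The following is a minimal sketch under that assumption; `buildPdfResponse` and the sample bytes are hypothetical names used for illustration, not part of html-pdf or of the original code.

```javascript
// Hypothetical helper: wraps a PDF buffer in the response shape used by
// API Gateway's Lambda proxy integration for binary content.
function buildPdfResponse(pdfBuffer) {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/pdf' },
    // The body must be a string, so the binary data is base64-encoded.
    body: pdfBuffer.toString('base64'),
    // Signals that the body should be decoded back to binary.
    isBase64Encoded: true,
  };
}

// Stand-in bytes; in a real function the buffer would come from the
// PDF generator (for example html-pdf).
const samplePdf = Buffer.from('%PDF-1.4 sample', 'utf8');
const response = buildPdfResponse(samplePdf);

// Decoding the body again yields the original bytes.
const roundTripped = Buffer.from(response.body, 'base64');
console.log(roundTripped.equals(samplePdf)); // true
```

In a handler this object would be the value returned to API Gateway. Clients may also need to send an `Accept` header that matches the configured binary media type for the decoding to take place.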

Here is an AWS forum post with a similar PDF problem to yours. Hope this helps.

Taterhead