I am adding page numbers to a PDF file.

It works correctly with English, but when I try to add Hebrew text it omits those letters.

I assume the problem is with the encoding to base64. How do I solve this?

Code Example

byte[] myBinary = File.ReadAllBytes(path);
using (var reader = new PdfReader(myBinary))
{
    using (var ms = new MemoryStream())
    {
        using (var stamper = new PdfStamper(reader, ms))
        {
            int PageCount = reader.NumberOfPages;
            for (int i = 1; i <= PageCount; i++)
            {
                ColumnText.ShowTextAligned(
                    stamper.GetUnderContent(i),
                    Element.ALIGN_CENTER,
                    new Phrase(String.Format("{0} מתוך {1}", i, PageCount)),
                    297f, 15f, 0);
            }
        }
        myBinary = ms.ToArray();
    }
}
string base64EncodedPDF = System.Convert.ToBase64String(myBinary);
return base64EncodedPDF;

On the front end, all I do is download the file.

$scope.open_letter = function (letter) {
    var _letter = myService.PrintLetter().then(function (data) {
        var pdfAsDataUri = "data:application/pdf;base64," + data.data;
        var a = document.createElement("a");
        a.href = pdfAsDataUri;
        a.download = "מכתב" + ".pdf";
        a.click();
    });
};

The reason I am asking is that it works perfectly in English but simply omits the Hebrew letters, which is interesting: I would have expected it to replace them with garbage characters.

BatshevaRich
  • What characters are being removed? Base64 is a binary operation; it doesn't care about characters. Can you be more specific about what you're doing and why? – Marc Gravell Jul 15 '20 at 11:23
  • please add the code you use for decoding, the problem is very unlikely ToBase64String – Patrick Beynio Jul 15 '20 at 11:37
  • I think the anchor points may be wrong since Hebrew is right to left. Try changing the 297f. See: https://stackoverflow.com/questions/35280015/itextsharp-showtextaligned-anchor-point – jdweng Jul 15 '20 at 12:29
  • You may be missing a font. I searched for the same issue with Turkish (which is also right to left): https://stackoverflow.com/questions/50086417/created-pdf-file-is-missing-characters-in-turkish-language – jdweng Jul 15 '20 at 12:38
  • Have you tried writing the modified PDF to a file-stream and checked that it works as intended? This would help isolate the problem to either the PdfStamper or the encoding. – JonasH Jul 15 '20 at 12:47
  • @MarcGravell it is removing the letters 'מתוך' – BatshevaRich Jul 15 '20 at 14:29
  • @PatrickBeynio added – BatshevaRich Jul 15 '20 at 14:30
  • @jdweng that is for the styling, to center the page number – BatshevaRich Jul 15 '20 at 14:30
  • What font is the styling using? Is the Hebrew letter inside the view (or outside the margins)? Is the Hebrew text behind a different object? – jdweng Jul 15 '20 at 14:36
  • @jdweng Correction: Turkish is left-to-right – Klaus Gütter Jul 15 '20 at 19:13
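
The point made in the comments, that base64 works on bytes and does not care which characters they represent, can be checked with a small round trip. This is only a sketch in JavaScript; the helper names `toBase64` and `fromBase64` are made up for illustration, and the sample text stands in for the PDF bytes (the Hebrew word is written with escapes to avoid any editor reordering).

```javascript
// Hypothetical helpers: encode raw bytes to base64 and back again.
function toBase64(bytes) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one char per byte (0-255)
  }
  return btoa(binary);
}

function fromBase64(b64) {
  const binary = atob(b64);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// Sample payload: "1 <Hebrew word> 5" encoded as UTF-8 bytes.
const sampleText = "1 \u05DE\u05EA\u05D5\u05DA 5";
const originalBytes = new TextEncoder().encode(sampleText);

// Round trip through base64 and decode the bytes back to text.
const roundTripped = fromBase64(toBase64(originalBytes));
const decodedText = new TextDecoder().decode(roundTripped);
```

If `decodedText` equals `sampleText`, the Hebrew bytes survived the encoding, which suggests the letters are lost earlier, when the text is drawn into the PDF.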

2 Answers

So I finally managed to solve this issue.

The problem was not that I was missing a font, but that I wasn't passing one at all to the Phrase constructor.

I guess the default font knows what to do with the English letters, but not with the Hebrew ones; presumably it has no glyphs for them, so they are simply dropped.

What I did was this:

// Load a font that contains Hebrew glyphs; IDENTITY_H selects Unicode (horizontal) encoding
BaseFont bf = BaseFont.CreateFont("c:/windows/Fonts/GISHA.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
Font f = new Font(bf, 8, Font.NORMAL, BaseColor.BLACK);

and then in my loop for page numbers, I did this:

int PageCount = reader.NumberOfPages;
for (int i = 1; i <= PageCount; i++)
{
    ColumnText.ShowTextAligned(
        stamper.GetUnderContent(i),
        Element.ALIGN_CENTER,
        // The Hebrew word is written reversed, with the placeholders swapped
        new Phrase(String.Format("{1} ךותמ {0}", i, PageCount), f),
        297f, 15f, 0);
}

Which solved my issue, and now it works beautifully.
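
Note that the format string in the loop stores the Hebrew word back to front and swaps the two placeholders, apparently because the text is drawn left to right as given. A minimal sketch of that reversal, in JavaScript for illustration only (`toVisualOrder` is a hypothetical helper, not part of iTextSharp):

```javascript
// Hypothetical helper: naive reversal of a purely right-to-left word.
// It does NOT handle mixed text (digits, Latin) or combining marks.
function toVisualOrder(text) {
  return Array.from(text).reverse().join("");
}

// The Hebrew word from the page label, in the order it is typed.
const logical = "\u05DE\u05EA\u05D5\u05DA";

// The same letters in the order used in the format string above.
const visual = toVisualOrder(logical);
```

Reversing by hand like this only works for short, fixed labels. As far as I know, iTextSharp 5 also has a ShowTextAligned overload that takes a run direction, which may be the cleaner route, but I have not tested it here.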

BatshevaRich

A piece of advice: don't use MemoryStream!

Use RecyclableMemoryStream instead if you want to avoid OutOfMemory issues caused by memory fragmentation.

https://www.philosophicalgeek.com/2015/02/06/announcing-microsoft-io-recycablememorystream/

Memorystream and Large Object Heap