
I'm trying to create a relative link in a PDF file created with iTextSharp.

Everything works fine with ASCII letters, but if I add other Unicode characters to the path, they are simply skipped.

This works fine:

Chunk chunk = new Chunk(text, font);
chunk.SetAnchor("./Attachments/1.jpg");

This creates an incorrect link (the link comes out as //1.jpg; the Вложения part is missing):

Chunk chunk = new Chunk(text, font);
chunk.SetAnchor("./Вложения/1.jpg");

Is there any way to create a correct link with Unicode characters? Thanks

Olja Muravjova
  • Do you have that correct font installed on the machine performing the pdf conversion? It sounds like you are missing the needed font. – Ross Bush Nov 12 '18 at 14:49
  • @RossBush I don't understand what you mean by an installed font. These kinds of paths are supported in file managers on my computer – Olja Muravjova Nov 12 '18 at 15:03
  • Some fonts like WingDings2 may or may not be installed on the server that is rendering the text. If there is a missing font you will oftentimes see odd squares or other anomalies. What OS are you using to render the PDF? – Ross Bush Nov 12 '18 at 17:10
  • @RossBush I use Windows. But I thought that these rendering problems only affect PDF text, not links, which are not even displayed. Also, these Unicode characters are Cyrillic, and I have Cyrillic fonts installed on my computer – Olja Muravjova Nov 12 '18 at 17:42
  • Muravjova - What is the font value being passed into new Chunk()? – Ross Bush Nov 12 '18 at 18:39
  • @RossBush I used Arial there. However, this font is only used to render the "text" parameter in the PDF, and there is no problem with that; the problem is with the link "behind" this text – Olja Muravjova Nov 12 '18 at 21:44
  • I am sorry Ross, but you are on a wild goose chase. The font has nothing to do with it. I am thinking more along the lines of the PDF specification being difficult about non-ASCII characters in link destinations. I don't know, I'd have to look it up in the PDF reference we have at the office, and I am sure that my colleagues wouldn't even have to look it up, but I am quite convinced that a link destination does not have a font. – Amedee Van Gasse Nov 13 '18 at 05:33

1 Answer

By using Chunk.SetAnchor in iText 5 you effectively generate a URI action. Its URI parameter is specified as

Key: URI | Type: ASCII string | Value: (Required) The uniform resource identifier to resolve, encoded in 7-bit ASCII.

(ISO 32000-1, Table 206 – Additional entries specific to a URI action)

Thus, it can be considered ok that non-ASCII characters like your Cyrillic ones are not accepted by Chunk.SetAnchor. (It is not ok, though, that they are simply dropped; if the method does not accept its input, it should throw an exception.)

But that by no means implies you cannot reference a file in a path that uses non-ASCII characters. Instead, you can make use of the fact that the path is treated as a URI. In particular, this means you can apply URL encoding (percent-encoding) to special characters!

Thus, simply replace

chunk.SetAnchor("./Вложения/1.jpg");

by

chunk.SetAnchor(WebUtility.UrlEncode("./Вложения/1.jpg"));

and your link works again! (At least it did in my tests.)
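One caveat, offered as a sketch rather than something I tested against a PDF viewer: WebUtility.UrlEncode also percent-encodes the "/" separators (as %2F). If a viewer turns out to be picky about that, an alternative is to encode each path segment separately and keep the slashes. The class and method names below (AnchorPaths, EncodeRelativePath) are hypothetical helpers of mine, not part of iText; only Uri.EscapeDataString is a real .NET call.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of iTextSharp): percent-encodes each
// segment of a relative path as UTF-8 while leaving the "/" separators
// untouched, so the result is pure 7-bit ASCII.
public static class AnchorPaths
{
    public static string EncodeRelativePath(string path)
    {
        // Split on "/", escape every segment on its own, then re-join.
        // Uri.EscapeDataString leaves unreserved characters such as
        // letters, digits and "." as they are.
        var segments = path.Split('/').Select(s => Uri.EscapeDataString(s));
        return string.Join("/", segments);
    }
}
```

It would be used as chunk.SetAnchor(AnchorPaths.EncodeRelativePath("./Вложения/1.jpg")); which passes "./%D0%92%D0%BB%D0%BE%D0%B6%D0%B5%D0%BD%D0%B8%D1%8F/1.jpg" to SetAnchor.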


PS: In .NET you actually have quite a choice of classes to do the URL encoding, cf. for example this answer. WebUtility.UrlEncode worked for me in the case at hand, but depending on your use case one of the others might be more appropriate.


PPS: The situation changes a bit in the newer PDF specification:

Key: URI | Type: ASCII string | Value: (Required) The uniform resource identifier to resolve, encoded in UTF8.

(ISO 32000-2, Table 210 — Additional entries specific to a URI action)

(I think the "ASCII" in the type column is a specification error and the UTF8 in the value column is to be taken seriously.)

But iText 5 has no PDF 2.0 support and therefore does not support UTF8 encoding here. One should probably test with iText 7, which claims PDF 2.0 support...

mkl