0

I'm using html2pdf (which itself uses TCPDF) to convert an HTML table element into a PDF file. This table has dynamic content, meaning that its number of columns can vary (sometimes 3, sometimes 11, etc.).

One column contains a number that can be really long (e.g. BF8545498134587). My problem is that when there are a lot of columns, this number exceeds the column width. I'd like to wrap it (like using word-wrap in HTML/CSS), but html2pdf doesn't support these CSS properties.

I found a solution (here: Html2pdf doesn't support word-break:break-all css) that consists of inserting a zero width space (&#8203; in HTML, \xE2\x80\x8B in UTF-8) between each character. It works perfectly fine in HTML, but in the PDF document, this zero width space is replaced by '?'.
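For reference, the insertion described above can be sketched roughly as follows. This is only a minimal illustration, not the exact code from my project: the function name insertZwsp is hypothetical, and it assumes the input string is valid UTF-8.

```php
<?php
// Hypothetical helper (the name is not from html2pdf or TCPDF):
// inserts a zero width space (U+200B) between every character of a string.
function insertZwsp(string $text): string
{
    $zwsp = "\xE2\x80\x8B"; // UTF-8 byte sequence of U+200B

    // Split into individual UTF-8 characters; the 'u' modifier makes the
    // pattern Unicode-aware, so multibyte characters are kept whole.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Re-join the characters with a zero width space between each pair.
    return implode($zwsp, $chars);
}

// The long number from the table cell, now breakable after any character.
$cell = insertZwsp('BF8545498134587');
```

The resulting string is what ends up in the td before the HTML is handed to html2pdf.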

I tried to change the font-family in the TCPDF class (having found this : http://www.fileformat.info/info/unicode/char/200b/fontsupport.htm to know which font to use with this HTML entity) but nothing's changed...

I saw other answers suggesting TCPDF functions like writeHTMLCell(), or MultiCell() rather than Cell(), but I can't apply these solutions for several reasons that I'm not allowed to disclose here.

So I don't know where to look now.

Thanks in advance for any help.


3 Answers

0

Nicolas, you must check that you are saving your file without a BOM. For example, Notepad++ on Windows and Geany on Linux both let you set this option easily.

Make sure to keep your file encoding as UTF-8.

The UTF-8 characters will be processed according to your HTML2PDF settings:

$html2pdf = new HTML2PDF('P', 'A4', 'it', true, 'UTF-8', array(0, 0, 0, 0));

EDIT 1:

The way you use plain spaces (' ') is not very good practice, and I would expect that the problem comes from there. Try replacing them with &nbsp; or %20. I would go with the first one.

Explanations:

&nbsp;

&nbsp; is an HTML character reference which actually refers to character 160 of Unicode (and also ISO-8859-1 aka Latin-1). It's a different space character entirely -- the "non-breaking space". Even though they look pretty much the same, they're different characters and it's unlikely that your server will treat them the same way.

%20

%20 is the URL escaping for byte 32, which corresponds to plain old space in pretty much any encoding you're likely to use in a URL.

EDIT 2:

Probably using margins would help:

$html2pdf->SetMargins(20,18);

EDIT 3:

I found on some Adobe forums that they recommend using this for the space:

&#160;

Explanations:

&nbsp; is the character entity reference. &#160; is the numeric entity reference. They are the same except for the fact that the latter does not need another lookup table to find its actual value.

Ilia Ross
  • Thanks for this answer. I create the HTML table element in PHP (so the table is a PHP string) in a file encoded in UTF-8 (I work with Coda 2 on Mac OSX and I think it's always without BOM (read that somewhere)). Tried to change the file encoding, still no success. Here's where I insert the ZWSP : http://viper-7.com/0HcgL3 – Nicolas Ferrari Jun 26 '14 at 08:19
  • For your EDIT 1: thanks for the tip, it improved some things in my PDF. But the "?" are still there... I don't understand how `&nbsp;` works (even its equivalent `&#160;`) but not `&#8203;`... – Nicolas Ferrari Jun 26 '14 at 08:27
  • ` ` - unbreakable space. If you put `___` you will not get 3 space on HTML page but only one. If you put `   ` - in this case you will. Just use ` ` all over, instead of a plain space. It should fix your problem. – Ilia Ross Jun 26 '14 at 08:33
  • Try using ` ` instead. – Ilia Ross Jun 26 '14 at 08:42
  • Thanks for your help, but like you said, `&nbsp;` is an unbreakable space, and what I want to insert is a "zero width space" (http://en.wikipedia.org/wiki/Zero-width_space ) so that words which are too long for the table cell in the PDF document get broken to fit the cell instead of overflowing it... What I don't understand is why the HTML entity `&#8203;` is replaced by "?" in my PDF document, despite all my encoding tests... – Nicolas Ferrari Jun 26 '14 at 08:43
  • `Zero-width space (ZWSP) is a non-printing character used in computerized typesetting` - and your are trying to print it, right? :) Try using ` ` instead. – Ilia Ross Jun 26 '14 at 09:03
  • Ah yeah sorry for that, didn't compute in my stupid brain... But it doesn't resolve my problem. I'd like to wrap my word correctly in the table cell so I need the ZWSP. With ` `, I have "w o r d" instead of "word" (which is logical as ` `is a plain space). Thanks for your time. – Nicolas Ferrari Jun 26 '14 at 09:50
0

Solved this several days ago : found and used "mpdf". This one is awesome :)

http://www.mpdf1.com/mpdf/index.php

-1

You should use <wbr> instead of a zero width space in your long number to allow line breaks within words in html2pdf.

  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient [reputation](http://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](http://stackoverflow.com/help/privileges/comment). - [From Review](/review/low-quality-posts/10663935) – Gerald Versluis Dec 23 '15 at 12:29
  • Hi @GeraldVersluis, I changed the phrasing of the answer to be more solution oriented, is the answer more acceptable now? Thank you for your feedback. – Fabien Schiettecatte Jan 14 '16 at 15:01
  • It generates HTTP 500 when put inside a text within a td – ZalemCitizen May 07 '20 at 15:33