47

I am working on a project using Ruby on Rails (3.1). I need to produce PDFs from HTML content, so I use the pdfkit gem.

When I convert HTML to PDF with the pdfkit gem, on some pages the characters of a single line are partially cut between pages.

version of wkhtmltopdf: wkhtmltopdf -- 0.11.0 rc1

operating system: Linux CentOS 5.5

The images below show characters partially cut between pages.

Please suggest a solution.

Example 1

[image: characters cut between pages]

Example 2

[image: characters cut between pages]

amexn

11 Answers

17

I had this problem with a table:

[image: table cut between pages]

Then I added this to my CSS:

table, img, blockquote {page-break-inside: avoid;}

This fixed the problem:

[image: table after the fix]
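
For a Rails app using pdfkit, as in the question, one way to apply this rule is to inject it into the HTML before conversion. The following is only a minimal sketch: `with_page_break_css` and `PAGE_BREAK_CSS` are hypothetical names, and the `PDFKit` call is left in a comment because it needs the gem and the wkhtmltopdf binary installed.

```ruby
# CSS rule from the answer above: ask the renderer not to split these elements.
PAGE_BREAK_CSS = "table, img, blockquote { page-break-inside: avoid; }"

# Hypothetical helper: adds a <style> block just before </head>.
# If the document has no </head>, the style block is prepended instead.
def with_page_break_css(html, css = PAGE_BREAK_CSS)
  style = "<style type=\"text/css\">#{css}</style>"
  if html.include?("</head>")
    html.sub("</head>") { "#{style}</head>" }
  else
    style + html
  end
end

html = "<html><head><title>Report</title></head><body><table></table></body></html>"
styled = with_page_break_css(html)
puts styled

# With the pdfkit gem installed, the styled HTML could then be converted:
#   PDFKit.new(styled, page_size: "A4").to_file("report.pdf")
```

As the comments below note, this rule may not help for every layout, so treat it as a first thing to try rather than a guaranteed fix.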

Besi
  • This didn't work for me. I tried setting this attribute on the enclosing td, tr and div, and the line is still cropped. – Amichai Schreiber Apr 20 '16 at 10:37
  • What I have found is that the basic rendering of the document respects `page-break-inside: avoid;`, but if you use the `--margin-bottom` option, then that **doesn't** respect that rule and starts splitting things mid-line. – Max Williams Apr 13 '23 at 09:29
11

I just ran across this and found something that resolved the issue for me. In my particular case, there were divs with display: inline-block; margin-bottom: -20px;. Once I changed them to block and reset the margin-bottom, the line splitting disappeared. YMMV.

nvahalik
  • Thank you, I had the same problem with an "article" element. After adding `display: block`, it worked like a charm. – nils Jan 31 '13 at 16:15
  • @nvahalik: To which element did you add display:block? I have a similar issue with exporting a table to PDF. SO question here - http://stackoverflow.com/questions/17046385/wicked-pdf-rendering-the-last-row-across-two-pages – usha Jun 11 '13 at 16:29
9

According to some documentation I found (see Page Breaking), this is a known issue. The documentation suggests using CSS page breaks to insert page breaks (assuming you are using the patched version of Qt):

The current page breaking algorithm of WebKit leaves much to be desired. Basically webkit will render everything into one long page, and then cut it up into pages. This means that if you have two columns of text where one is vertically shifted by half a line. Then webkit will cut a line into to pieces display the top half on one page. And the bottom half on another page. It will also break image in two and so on. If you are using the patched version of QT you can use the CSS page-break-inside property to remedy this somewhat. There is no easy solution to this problem, until this is solved try organising your HTML documents such that it contains many lines on which pages can be cut cleanly.

See also: http://code.google.com/p/wkhtmltopdf/issues/detail?id=9, http://code.google.com/p/wkhtmltopdf/issues/detail?id=33 and http://code.google.com/p/wkhtmltopdf/issues/detail?id=57.
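
To make the quoted description concrete, here is a toy model of "render one long page, then cut it up". It is a sketch for illustration only: the `Line` struct, the helper name and all pixel values are made up and do not reflect wkhtmltopdf internals.

```ruby
# Toy model: each line of text is a box with a top offset and a height,
# and a page cut happens at every multiple of page_height.
Line = Struct.new(:top, :height) do
  def bottom
    top + height
  end
end

# Returns the lines that a page cut passes through.
def lines_cut_by_page_breaks(lines, page_height)
  lines.select do |line|
    # A line is cut when its top and bottom edges fall on different pages.
    first_page = line.top / page_height
    last_page  = (line.bottom - 1) / page_height
    first_page != last_page
  end
end

page_height = 1000
left_column  = (0...60).map { |i| Line.new(i * 20, 20) }       # lines end exactly at the cut
right_column = (0...60).map { |i| Line.new(i * 20 + 10, 20) }  # shifted down by half a line

p lines_cut_by_page_breaks(left_column, page_height).size   # => 0
p lines_cut_by_page_breaks(right_column, page_height).size  # => 1
```

In this model the shifted column has one line (starting at offset 990) that straddles the cut at 1000, which is the "top half on one page, bottom half on another" effect the documentation describes.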

Peter Brown
  • 1
    This is no longer the case. The answer by @Besi resolves any page break issues, not to mention just getting the latest version of `wkhtmltopdf` (0.12.2.1). Add the following to your CSS: `table, img, blockquote {page-break-inside: avoid;}` – craned May 08 '15 at 16:14
  • 1
    @craned not right. The problem is only partially solved and it's still there. `page-break-inside` only helps for the whole block you add it to. For example, if one paragraph/block is more than a page long, then `page-break-inside` will not help and the text will be cut in some cases. That is okay to fix for static text, but it is a problem with dynamically generated text when you don't know how long a particular block will be. So the problem is still there and only partially resolved. – Neel Nov 06 '16 at 11:20
  • @Neel, in that case I'd say it's mostly solved. At least in my particular scenario, 1 paragraph/block was never going to be a problem. Quite frankly, a paragraph/block should never be longer than a normal page, but in what seems like the rare case that it is, then yes, that would be a place where the problem still exists. – craned Nov 16 '16 at 03:10
  • Am using 0.12.5.0 (patched QT) and it is still breaking for me. – Krishna Prasad Varma Apr 28 '19 at 12:13
5

In my case, the issue was resolved by commenting out the following css:

html, body {
  overflow-x: hidden;
} 

In general, check if any tags have overflow set as hidden and remove it or set it to visible.

Btw, I am using wkhtmltopdf version 0.12.2.1 on Windows 8.
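
To find such declarations in a larger stylesheet, a small scan can help. This is a sketch under assumptions: `suspicious_overflow_rules` is a hypothetical helper, and it uses a plain regex rather than a real CSS parser, so it only lists candidates to review by hand.

```ruby
# Matches "overflow", "overflow-x" or "overflow-y" declarations,
# but not other properties that end in "-overflow" (e.g. text-overflow).
OVERFLOW_DECLARATION = /(?<![\w-])(overflow(?:-[xy])?)\s*:\s*([a-z-]+)/i

# Hypothetical helper: lists overflow declarations whose value is not "visible".
def suspicious_overflow_rules(css)
  css.scan(OVERFLOW_DECLARATION)
     .reject { |_property, value| value.casecmp?("visible") }
     .map { |property, value| "#{property.downcase}: #{value.downcase}" }
end

css = <<~CSS
  html, body {
    overflow-x: hidden;
  }
  .markdown-body pre {
    padding: 16px;
    overflow: auto;
  }
  .ok { overflow: visible; }
CSS

p suspicious_overflow_rules(css)
# => ["overflow-x: hidden", "overflow: auto"]
```

Other answers on this page report different results for different overflow values, so the list is a starting point for experiments rather than a verdict.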

Pedro M Duarte
2

This is old, but hopefully it will help someone. I was having this issue too and tried everything, even reverting to the old versions mentioned (12.1), to no avail. I kept tweaking the CSS and adding page-break avoids everywhere without much progress. Then I tweaked the CSS on the root div of my HTML, and that fixed it. I made so many tweaks that I can't be 100% sure, but I believe the issue was that the main outer div was set to 'display:table' with margin: 0 auto and a specific width. Once I removed that, it stopped cutting off images and tables mid-row, and page-break-inside: avoid then worked as expected.

I believe ultimately the code is trying to guess as best as it can exactly how many pixels high each page is, and where exactly (down to the pixel) is your content. We have to make it easy for the library to detect this by removing as much odd css in there as possible, so it's as simple as possible to calculate down to the pixel where the content lies. That's my guess.

Rob
2

https://github.com/ArthurHub/HTML-Renderer/issues/38

var head = "<head><style type=\"text/css\"> td, h1, h2, h3, p, b, div, i, span, label, ul, li, tr, table { page-break-inside: avoid; } </style></head>";

// m42Notes (the HTML body) and configurationOptions are defined elsewhere.
PdfDocument pdf = PdfGenerator.GeneratePdf("<html>" + head + "<body>" + m42Notes + "</body></html>", configurationOptions);

Ben Wong
1

I scoured the internet for a couple of weeks, trying to overcome this issue. None of the solutions I found worked for me, but something else did.

I had a two column layout where the text was getting cut off mid-text. In the broken state, my basic structure looked like this:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}
.col-9{
  display: inline-block;
  width: 70%;
}
.col-3{
  display: inline-block;
  width: 25%;
}

<div class="col-9">
  [a lot of text here, that would spill over multiple pages]
</div>
<div class="col-3">
  [a short sidebar here]
</div>

I fixed it by changing it to:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}

.col-9{
  display: block;
  float: left;
  width: 70%;
}
.col-3{
  display: block;
  float: left;
  width: 25%;
}
.clear{
  clear: both;
}

<div class="col-9">
  [a lot of text here, that no longer split mid-line.]
</div>
<div class="col-3">
  [a short sidebar here]
</div>
<div class="clear"></div>

For some reason, the tool could not handle the display: inline-block setup. It works with floats. I'm running version 0.12.4.

Mike Caputo
1

I solved the problem by adding margin-top and margin-bottom, like this:

$this->get('knp_snappy.pdf')->generateFromHtml($html, $pdfFilepath, [
        'default-header' => false,
        'header-line' => false,
        'footer-line' => false,
        'disable-javascript' => true,
        'margin-top' => '3mm',
        'margin-bottom' => '3mm',
        'margin-right' => '5mm',
        'margin-left' => '5mm',
        'orientation' => 'Landscape',
    ], true);
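
For readers calling wkhtmltopdf from Ruby rather than PHP, the same options can be expressed as command-line flags. This is a minimal sketch: `wkhtmltopdf_args` is a hypothetical helper that only builds the argument list and does not run the binary.

```ruby
# Hypothetical helper: turns an options hash into wkhtmltopdf-style flags.
# `true` values become bare switches; false/nil values are skipped;
# any other value follows its flag as a separate argument.
def wkhtmltopdf_args(options)
  options.flat_map do |name, value|
    flag = "--#{name.to_s.tr('_', '-')}"
    case value
    when true then [flag]
    when false, nil then []
    else [flag, value.to_s]
    end
  end
end

args = wkhtmltopdf_args(
  margin_top: "3mm",
  margin_bottom: "3mm",
  margin_right: "5mm",
  margin_left: "5mm",
  orientation: "Landscape",
  disable_javascript: true
)
puts args.join(" ")
# prints: --margin-top 3mm --margin-bottom 3mm --margin-right 5mm --margin-left 5mm --orientation Landscape --disable-javascript
```

Comments elsewhere on this page report that `--margin-bottom` brought the cut text back in their setup, so it is worth comparing output with and without the margin flags.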
0

The cut text problem is a known webkit problem and it seems developers found a solution inside wkhtmltopdf. Updating to 0.12.1 will fix the cut-text problem (if you don't want to waste time with compilations, you can just take the binary file from here: https://github.com/h4cc/wkhtmltopdf-amd64 ).
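
Because reports on this page differ by release, it can help to check which version is actually in use. This is a sketch under assumptions: `parse_wkhtmltopdf_version` is a hypothetical helper, and the version-line format is taken from the strings quoted in the comments on this page.

```ruby
require "rubygems" # provides Gem::Version (loaded by default in most setups)

# Hypothetical helper: parses a line such as
# "wkhtmltopdf 0.12.6 (with patched qt)" into a version and a patched-Qt flag.
def parse_wkhtmltopdf_version(line)
  match = line.match(/wkhtmltopdf\s+(\d+(?:\.\d+)+)/)
  return nil unless match

  {
    version: Gem::Version.new(match[1]),
    patched_qt: line.include?("with patched qt")
  }
end

info = parse_wkhtmltopdf_version("wkhtmltopdf 0.12.6 (with patched qt)")
puts info[:version]                                   # prints 0.12.6
puts info[:patched_qt]                                # prints true
puts info[:version] >= Gem::Version.new("0.12.1")     # prints true

# On a machine with the binary installed, the line could come from:
#   `wkhtmltopdf --version`
```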

Dragos Rusu
  • 1
    I'm using 0.12.2.1, which is a more updated version of wkhtmltopdf. I still seem to have this problem, so I don't think this is the fix (since I doubt they reintroduced the bug in a newer version). – rageandqq Feb 09 '15 at 15:46
  • I confirm 0.12.1 worked at the moment - didn't play with it since. – Dragos Rusu Feb 10 '15 at 17:09
  • Using wkhtmltopdf 0.12.2.1 (with patched qt), still the issue. – Alesis Joan Aug 04 '17 at 12:22
  • I have `wkhtmltopdf 0.12.6 (with patched qt)` and I still have the issue, but **only** when I use the `--margin-bottom` option. I think that whatever fix decides where to split the pages is applied **before** the margin options, and so the margin options mess it up. – Max Williams Apr 13 '23 at 09:32
0

I have been putting up with this for months and finally found a fix for my situation. I'm using the GitHub CSS stylesheet in the HTML file I'm converting, and code blocks that span multiple pages get their text cut off. Nothing is missing; the line is just cut in half.

Bottom of a page:

[image: bottom of page]

Start of next page:

[image: start of next page]

So in the github stylesheet overflow is set to auto for <pre> tags.

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: auto;
...

Switching the overflow property to hidden solved it for me!

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: hidden;

I think I tried all the other answers on this page, but this is what solved it for me. Hope it helps someone else out :)

jpmc
0

I was able to find a workaround to this issue by installing wkhtmltox_0.12.6-1.bionic_amd64.deb (for Ubuntu) from https://github.com/wkhtmltopdf/packaging/releases/0.12.6-1

After updating the wkhtmltox package, tables and text are no longer cut off at the end of the page. However, this fix introduced a different issue for me: the generated PDF now has no styling. For example, font-family, font-size and even text alignment are all gone, and some default settings are used instead.