
I generated a PDF file from an HTML file using iText pdfHTML. Now I want to add a table of contents (TOC) on the 2nd page. I saw the same question, adding-toc-dynamically, but it has no answer. I tried the same approach as its author. How can I get the page numbers for the TOC? How can I add a TOC using pdfHTML? Is it possible at all?

RedWhite
    If there is no answer on the duplicate post, try to improve it. You may want to contact the authors of said library to get support. – Sven Mawby Apr 08 '20 at 09:19

1 Answer


The answer in the duplicate question (HTML to PDF adding a table of contents (TOC) dynamically) has not received any upvotes, so I cannot close this question as a duplicate. I am posting the answer here instead:

My answer is in Java, but you can easily convert it to .NET because the Jsoup version I use is the one embedded in iText, and the rest of the conversion is mostly a matter of changing method names to start with an uppercase letter.

It's possible now to generate a table of contents with pure pdfHTML 3.0.3+, but this will naturally require some preprocessing of your HTML file.

The idea is to loop over the elements that are marked with a data-toc attribute and create a corresponding table of contents entry for each of them. The most interesting part is generating the page numbers that indicate where the referenced content is placed. We use the target-counter CSS function for that. We also give every element that has a data-toc attribute a unique ID, so that we can refer to it both in target-counter and in plain links that jump to the element.
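To make the two generated pieces concrete, here is a minimal sketch in plain Java (no iText needed) of what gets produced for a single element: the CSS rule that prints the page number via target-counter, and the link that jumps to the element. The class name TocSketch, the helper names, and the id "chapter-1" are hypothetical and used for illustration only; the full code further down generates a random UUID for each element instead.

```java
public class TocSketch {
    // Hypothetical helper: CSS rule that fills the placeholder <span> of one
    // TOC row with the page number on which the element with this id is placed.
    public static String pageRefRule(String id) {
        return "*[data-toc-id=\"" + id + "\"] .toc-page-ref::after { "
                + "content: target-counter(#" + id + ", page) }";
    }

    // Hypothetical helper: link that jumps to the element, wrapping the
    // placeholder <span> that receives the page number.
    public static String pageRefLink(String id) {
        return "<a href=\"#" + id + "\"><span class=\"toc-page-ref\"></span></a>";
    }

    public static void main(String[] args) {
        String id = "chapter-1"; // illustrative id, not a UUID
        System.out.println(pageRefRule(id));
        System.out.println(pageRefLink(id));
    }
}
```

The rule is scoped by the data-toc-id attribute of the TOC row, so each entry resolves the page number of its own target only.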

Here is example code with some explanatory comments:

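import com.itextpdf.html2pdf.HtmlConverter;
// Jsoup classes come from the copy embedded in iText (styled-xml-parser module)
import com.itextpdf.styledxmlparser.jsoup.Jsoup;
import com.itextpdf.styledxmlparser.jsoup.nodes.Document;
import com.itextpdf.styledxmlparser.jsoup.nodes.Element;
import com.itextpdf.styledxmlparser.jsoup.select.Elements;

import java.io.File;
import java.io.FileOutputStream;
import java.util.UUID;

Document htmlDoc = Jsoup.parse(new File("path/to/in.html"), "UTF-8");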

// This is our Table of Contents aggregating element
Element tocElement = htmlDoc.body().prependElement("div");
tocElement.append("<b>Table of contents</b>");

// We are going to build a complex CSS
StringBuilder tocStyles = new StringBuilder().append("<style>");

Elements tocElements = htmlDoc.select("[data-toc]");
for (Element elem : tocElements) {
    // Here we create an anchor to be able to refer to this element when generating page numbers and links
    String id = UUID.randomUUID().toString();
    elem.attr("id", id);

    // CSS rule that shows the page number for a TOC entry
    tocStyles.append("*[data-toc-id=\"" + id + "\"] .toc-page-ref::after { content: target-counter(#" + id + ", page) }");

    // Generating TOC entry as a small table to align page numbers on the right
    Element tocEntry = tocElement.appendElement("table");
    tocEntry.attr("style", "width: 100%");
    Element tocEntryRow = tocEntry.appendElement("tr");
    tocEntryRow.attr("data-toc-id", id);
    Element tocEntryTitle = tocEntryRow.appendElement("td");
    tocEntryTitle.appendText(elem.attr("data-toc"));
    Element tocEntryPageRef = tocEntryRow.appendElement("td");
    tocEntryPageRef.attr("style", "text-align: right");
    // <span> is a placeholder element where target page number will be inserted
    // It is wrapped by an <a> tag to create links pointing to the element in our document
    tocEntryPageRef.append("<a href=\"#" + id + "\"><span class=\"toc-page-ref\"></span></a>");
}


tocStyles.append("</style>");

htmlDoc.head().append(tocStyles.toString());

String html = htmlDoc.outerHtml();

HtmlConverter.convertToPdf(html, new FileOutputStream("path/to/out.pdf"));

Visual representation of the result I got using the above code and your example file:

[Image: result]

Alexey Subach