How can I convert XHTML nested list to pdf with iText?

Question

I have XHTML content from which I need to create a PDF file on the fly, using iText. I tried the simple approach, but the result is always wrong after calling the XMLWorkerHelper parser.

XHTML: <ul> <li>First <ol> <li>Second</li> <li>Second</li> </ol> </li> <li>First</li> </ul>

Expected result:

First
1. Second
2. Second
First

PDF result:

First Second Second
First

The result contains no nested list. I need a solution that calls the parser and does not create an iText Document instance.

score 3 · Answer 1 · answered Nov 05 '14 at 13:01

Please take a look at the NestedListHtml example.

In this example, I take your code snippet list.html:

<ul>
  <li>First
    <ol>
      <li>Second</li>
      <li>Second</li>
    </ol>
  </li>
  <li>First</li>
</ul>

And I parse it into an ElementList:

// CSS
CSSResolver cssResolver =
    XMLWorkerHelper.getInstance().getDefaultCssResolver(true);

// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);

// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));

Now I can add this list to the Document:

for (Element e : elements) {
    document.add(e);
}

Or I can add this list to a Paragraph:

Paragraph para = new Paragraph();
for (Element e : elements) {
    para.add(e);
}
document.add(para);

You will get the desired result, as shown in nested_list.pdf.

You cannot add nested lists to a PdfPCell or to a ColumnText. For instance, this will not work:

PdfPTable table = new PdfPTable(2);
table.addCell("Nested lists don't work in a cell");
PdfPCell cell = new PdfPCell();
for (Element e : elements) {
    cell.addElement(e);
}
table.addCell(cell);
document.add(table);

This is due to a limitation in the ColumnText class that has been there for many years. We have evaluated the problem, and the only way to fix it would be to rewrite ColumnText entirely. This is not an item on our current technical roadmap.

I need to fill an existing PDF with rich text data, and I am allowed to render the overflowing data on a new page. How can I do that without using ColumnText? With document.add, what happens to the cut-off text, and how can I split the text to fit across pages? I'm sorry if this question is too obvious to you; I just started using iText. — user1541389, Jan 13 '17 at 03:10
I don't understand what you don't understand. Please create a new question and include an [SSCCE](http://sscce.org) that explains what you're trying to do, and explain why you think that it doesn't work. — Bruno Lowagie, Jan 13 '17 at 08:46

score 0 · Answer 2 · edited Jan 31 '19 at 17:27

Here's a workaround for nested ordered and unordered lists.

The rich text editor I am using puts a class attribute such as "ql-indent-1", "ql-indent-2", and so on, on the li tags. Based on that attribute, the code below adds the opening and closing ul/ol tags.

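// Imports needed by the two methods below (jsoup and the JDK).
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

public String replaceIndentSubList(String htmlContent) {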
    org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(htmlContent);
    Elements element_UL = document.select("ul");
    Elements element_OL = document.select("ol");
    if (!element_UL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_UL, "ul");
    }
    if (!element_OL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_OL, "ol");
    }
    return htmlContent;
}


public String replaceIndents(String htmlContent, Elements element, String tagType) {
    String attributeKey = "class";
    String startingULTgas = "<" + tagType + ">";
    String endingULTags = "</" + tagType + ">";
    int lengthOfQLIndenet = "ql-indent-".length();
    HashMap<String, String> startingLiTagMap = new HashMap<String, String>();
    HashMap<String, String> lastLiTagMap = new HashMap<String, String>();
    Pattern regex = Pattern.compile("ql-indent-\\d");
    HashSet<String> hash_Set = new HashSet<String>();
    Elements element_Tag = element.select("li");
    for (org.jsoup.nodes.Element element2 : element_Tag) {
        org.jsoup.nodes.Attributes att = element2.attributes();
        if (att.hasKey(attributeKey)) {
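            String attributeValue = att.get(attributeKey);
            Matcher matcher = regex.matcher(attributeValue);
            if (matcher.find()) {
                // Key the maps by the matched indent class (e.g. "ql-indent-1"),
                // because the lookup loop below uses that same key. Keying by the
                // full class attribute fails when other classes are present.
                String indentClass = matcher.group(0);
                if (!startingLiTagMap.containsKey(indentClass)) {
                    startingLiTagMap.put(indentClass, element2.toString());
                }
                hash_Set.add(indentClass);
                if (!startingLiTagMap.get(indentClass)
                        .equalsIgnoreCase(element2.toString())) {
                    lastLiTagMap.put(indentClass, element2.toString());
                }
            }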
        }
    }
    System.out.println(htmlContent);
    Iterator<String> value = hash_Set.iterator();
    while (value.hasNext()) {
        String liAttributeKey = value.next();
        int noOfIndentes = Integer
                .parseInt(liAttributeKey.substring(lengthOfQLIndenet));
        if (noOfIndentes > 1)
            for (int i = 1; i < noOfIndentes; i++) {
                startingULTgas = startingULTgas + "<" + tagType + ">";
                endingULTags = endingULTags + "</" + tagType + ">";
            }
        htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                startingULTgas + startingLiTagMap.get(liAttributeKey));
        if (lastLiTagMap.get(liAttributeKey) != null) {
            System.out.println("Inside last Li Map");
            htmlContent = htmlContent.replace(lastLiTagMap.get(liAttributeKey),
                    lastLiTagMap.get(liAttributeKey) + endingULTags);
        }
        else {
            htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                    startingLiTagMap.get(liAttributeKey) + endingULTags);
        }
        startingULTgas = "<" + tagType + ">";
        endingULTags = "</" + tagType + ">";
    }
    System.out.println(htmlContent);
    return htmlContent;
}
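
For comparison, here is a minimal sketch of a different way to do the same conversion. It is not the code above: it walks the items once and tracks the current depth, so each sublist ends up inside the preceding li, as in the XHTML from the question. The class name IndentToNestedList and the method nest are hypothetical. The sketch assumes the input is a flat run of items of the form <li>text</li> or <li class="ql-indent-N">text</li> with no other attributes, and it uses a plain regular expression rather than jsoup, so it will not cope with arbitrary HTML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns flat, indent-classed <li> items into nested lists.
class IndentToNestedList {

    // Assumption: items look like <li>text</li> or <li class="ql-indent-N">text</li>.
    private static final Pattern ITEM = Pattern.compile(
            "<li(?:\\s+class=\"ql-indent-(\\d+)\")?>(.*?)</li>", Pattern.DOTALL);

    static String nest(String flatItems, String tagType) {
        String open = "<" + tagType + ">";
        String close = "</" + tagType + ">";
        StringBuilder out = new StringBuilder();
        int depth = -1; // index of the innermost open list; -1 means none is open

        Matcher m = ITEM.matcher(flatItems);
        while (m.find()) {
            int level = (m.group(1) == null) ? 0 : Integer.parseInt(m.group(1));
            // Assumption: an indent jump of more than one level is clamped to one.
            level = Math.min(level, depth + 1);

            if (level == depth + 1) {
                // One level deeper: open a list inside the still-open parent <li>.
                out.append(open);
                depth++;
            } else {
                // Same level or shallower: close the previous item, then close
                // every list (and its parent item) deeper than the new level.
                out.append("</li>");
                while (depth > level) {
                    out.append(close).append("</li>");
                    depth--;
                }
            }
            // Leave the new item open in case the next item nests inside it.
            out.append("<li>").append(m.group(2));
        }

        // Close whatever is still open.
        while (depth >= 0) {
            out.append("</li>").append(close);
            depth--;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String flat = "<li>First</li>"
                + "<li class=\"ql-indent-1\">Second</li>"
                + "<li class=\"ql-indent-1\">Second</li>"
                + "<li>First</li>";
        System.out.println(nest(flat, "ul"));
    }
}
```

Running main prints <ul><li>First<ul><li>Second</li><li>Second</li></ul></li><li>First</li></ul>, which has the same nesting shape as the snippet in the question (with ul for both levels). Passing "ol" as the second argument produces ordered lists instead.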
