I found a few examples showing how to convert XML to PDF using an iText XML document, but they are all for the older version 4.x. Are there any examples for version 5.x, or can someone post the updated code required to do the same?

All the examples refer to code like this, but I cannot find what replaces the ITextHandler class in the new version.
http://www.ridgway.co.za/archive/2005/07/31/itextsharpxmltopdfexample.aspx

Document document = new Document();
PdfWriter.GetInstance(document, new FileStream("ExampleDoc.pdf", FileMode.Create));
ITextHandler xmlHandler = new ITextHandler(document);
xmlHandler.Parse("ExampleDoc.xml");

Also, I am not trying to go from HTML to PDF. The CSS styling never comes out as expected.

Editing to bump it up, really need some help here. Anyone at all?

John C

1 Answer

iText's processing of XML files using a proprietary syntax was removed a very long time ago. See this and this for direct answers from the author. Instead you are encouraged to use the globally recognized XML standard which is XHTML.

I know you said that you don't want to use HTML because it never comes out correctly but maybe you could post some samples of what you're trying and we could help. Also, please make sure that you are using the XMLWorker and not the HTMLWorker. See these links for additional help/info when using it.
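For reference, here is a minimal sketch of what the XMLWorker route might look like in iTextSharp 5.x. It assumes the separate XMLWorker assembly is referenced alongside iTextSharp; the class name `XhtmlToPdfExample`, the sample markup, and the output file name are made up for illustration.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class XhtmlToPdfExample
{
    // Writes the given XHTML string to a PDF file using XMLWorker.
    public static void Convert(string xhtml, string outputPath)
    {
        using (var stream = new FileStream(outputPath, FileMode.Create))
        using (var document = new Document())
        {
            PdfWriter writer = PdfWriter.GetInstance(document, stream);
            document.Open();

            using (var reader = new StringReader(xhtml))
            {
                // XMLWorkerHelper parses XHTML (not the old iText XML syntax)
                // and adds the resulting elements to the open document.
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
            }

            document.Close();
        }
    }

    public static void Main()
    {
        // Hypothetical sample input; real input must be well-formed XHTML.
        Convert("<html><body><p>Hello from XMLWorker</p></body></html>", "ExampleDoc.pdf");
    }
}
```

Note that the input has to be well-formed XML, so unclosed tags that a browser would tolerate will likely cause a parse error here.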

EDIT

This edit is in answer to @JohnC's comment

I can't speak for the iText team and their reasons but I can guess at things. PDF doesn't have "paragraphs", "words", "tables", etc. Instead, PDF has text, drawings (lines, patterns) and images. If you want to do these things manually you can use the raw PdfContentByte objects. You are encouraged, however, to use iText's abstractions like Paragraph and PdfPTable which use the PdfContentByte on your behalf.
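To make the difference concrete, here is a rough sketch that uses both approaches in one document: `Paragraph` and `PdfPTable` for flowed content, and `PdfContentByte` for text placed at an explicit coordinate. The class name, the table contents, and the coordinates are illustrative assumptions only.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class AbstractionsExample
{
    public static void Build(string outputPath)
    {
        using (var stream = new FileStream(outputPath, FileMode.Create))
        using (var document = new Document(PageSize.A4))
        {
            PdfWriter writer = PdfWriter.GetInstance(document, stream);
            document.Open();

            // High-level abstractions: iText works out wrapping and layout.
            document.Add(new Paragraph("Inventory report"));
            var table = new PdfPTable(2); // two columns
            table.AddCell("Item");
            table.AddCell("Quantity");
            table.AddCell("Widget");
            table.AddCell("12");
            document.Add(table);

            // Low-level: draw text at an explicit position yourself.
            // PDF user space starts at the bottom-left corner of the page.
            PdfContentByte cb = writer.DirectContent;
            BaseFont font = BaseFont.CreateFont(
                BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            cb.BeginText();
            cb.SetFontAndSize(font, 12);
            cb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "Placed at (72, 72)", 72, 72, 0);
            cb.EndText();

            document.Close();
        }
    }

    public static void Main()
    {
        Build("Abstractions.pdf");
    }
}
```

With the direct-content calls nothing wraps or flows for you, which is why the abstractions are usually the better starting point.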

For iText to support an XML format, it would first need to create its own proprietary DTD and/or XML Schema. If any features were added, it would then need to version the schema properly, which can cause problems and confusion for consumers. Then it would need to build and maintain a parser that turned the XML abstractions into either iText abstractions or raw PDF commands. For the former, you have an abstraction talking to an abstraction, which is just begging to break. For the latter, you now have two abstraction implementations that will eventually run into feature parity issues.

Further, what would the XML represent? Paragraphs, chunks of text, images and tables? That sounds like HTML already, so there's no need to repeat that kind of schema. Or would it be "put content Z at coordinate X,Y using Font ABC"? That's where the PdfContentByte comes in. True, there could be a native parser, but I'm guessing there just aren't many people asking for one. Or would the XML be your own format based on your own data, with things like <book> and <inventory>? If that's the case, then iText would really have no idea how to style that either. You could, however, leverage .NET/Java and XSLT to transform your XML into XHTML that it does understand.
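As a rough illustration of that last option, the sketch below uses .NET's built-in `XslCompiledTransform` to turn a custom `<inventory>`/`<book>` document into XHTML. The stylesheet, the `title` attribute, and the class name are all hypothetical; your own data format will need its own stylesheet.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlToXhtmlExample
{
    // Hypothetical stylesheet: maps <inventory>/<book title="..."> to an XHTML list.
    private const string Stylesheet = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>
  <xsl:template match='/inventory'>
    <html><body><ul>
      <xsl:for-each select='book'>
        <li><xsl:value-of select='@title'/></li>
      </xsl:for-each>
    </ul></body></html>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the XHTML as a string.
    public static string Transform(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // Reuse the writer settings declared by xsl:output.
            using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
            {
                xslt.Transform(input, writer);
            }
            return output.ToString();
        }
    }

    public static void Main()
    {
        string xml = "<inventory><book title='Dune'/><book title='Emma'/></inventory>";
        Console.WriteLine(Transform(xml));
    }
}
```

The resulting string can then be handed to XMLWorker, so the styling decisions live in the stylesheet rather than in a custom PDF-specific XML format.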

Chris Haas
  • The first link just says he removed support for the DTD; it is not very clear whether that referred to the entire XML-to-PDF functionality. It seems easier to build an iText XML template with exact directions telling the PDF where things go than to use XHTML and work out its limited CSS understanding by trial and error. – John C Feb 10 '14 at 15:07
  • @JohnC, I responded above. I don't speak for the iText team so this is just my own personal take. I'm not saying an XML representation would be bad and in fact I've done it myself. However, the iText team has to triage what to work on and I don't think many people are asking for this. If you can show a specific use-case to them they may respond with suggestions or might possibly add it as a feature in a future version. – Chris Haas Feb 10 '14 at 17:08
  • Well, from what I saw, the iText XML in the older version was just that: put content Z at coordinate X,Y with Font ABC. My problem with XHTML is not the XHTML part; if it worked as expected, it would be great. The problem is that its CSS translator does not translate as expected. Are there any "good" examples of code-built PDFs using PdfContentByte or iText's abstractions like Paragraph and PdfPTable? – John C Feb 12 '14 at 14:52
  • Have you tried the XMLWorker? It really does a good job, especially if you're using the simpler CSS properties like fonts and colors. The Controlling Fonts link above shows a very basic example of loading a CSS file and registering your fonts. One thing you can't do is absolutely position things, since HTML and PDF coordinates use different origins. Otherwise, pretty much every example out there uses the abstractions like `Paragraph`. – Chris Haas Feb 12 '14 at 17:47
  • If you want to use the `PdfContentByte`, there's an implicit rule that you've read the PDF spec, at least partially. The actual PDF spec doesn't support line breaks, bolding of text, tables, etc.; you need to do these things manually. If you want, you can email me (see profile) and we can talk about your specific need. – Chris Haas Feb 12 '14 at 17:50
  • Not sure where I am supposed to find your email in your profile. I awarded you the bounty as you helped with the most and are willing to continue to help offline. – John C Feb 17 '14 at 22:21
  • I emailed you back, did you not receive my last reply? – John C Jun 23 '14 at 22:40
  • Sorry John, I didn't see it, when did you send it? – Chris Haas Jun 23 '14 at 23:24
  • Let me look, I will resend it. – John C Jun 23 '14 at 23:27
  • Took me a while to figure out what computer I sent it on, I found it and just resent it. – John C Jun 26 '14 at 22:49
  • Chris, did you get my email? – John C Jun 30 '14 at 17:39