3

I have been using iTextSharp for HTML-to-PDF conversion. Overall it works fairly well, but it doesn't seem to like most of the formatting.

Bold, italic, and underline all work. However, font sizes, font styles, and other formatting are not respected, so the export looks very little like the HTML it was created from.

Does anyone know either

  • how to fix the way iTextSharp exports (a sample of my code is below), or
  • of a different product that provides this functionality and will not break the bank?

This is my code:

// Convert the HTML file to a PDF with iTextSharp's HtmlParser
Document document = new Document(PageSize.A4);
using (Stream output = new FileStream(Server.MapPath(relDownloadDoc), FileMode.Create, FileAccess.Write, FileShare.None))
using (Stream htmlStream = new FileStream(Server.MapPath(relProcessingDoc), FileMode.Open, FileAccess.Read, FileShare.Read))
using (XmlTextReader reader = new XmlTextReader(htmlStream))
{
    // Ignore whitespace-only nodes between tags
    reader.WhitespaceHandling = WhitespaceHandling.None;
    // Attach a PDF writer to the document before opening it
    PdfWriter.GetInstance(document, output);
    document.Open();
    // Parse the markup and add the resulting elements to the document
    HtmlParser.Parse(document, reader);
    document.Close();
}
Joris Schellekens
Mitchel Sellers

4 Answers

3

Try WKHTMLTOPDF.

It's an open source command-line tool built on the WebKit rendering engine. Both are free.

We've set up a small tutorial here
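For anyone who wants to try this from .NET, here is a rough sketch of shelling out to the wkhtmltopdf command line. It is only an illustration, not the tutorial's code: the class name, method names, and timeout are made up for the example, and it assumes wkhtmltopdf is installed and that your host lets you start processes (shared hosting often does not).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, named for this example only.
public static class WkHtmlToPdfSketch
{
    // Wraps a value in double quotes so a path with spaces stays one argument.
    // Assumption: the value itself contains no double quotes.
    public static string Quote(string value)
    {
        return "\"" + value + "\"";
    }

    // Builds "wkhtmltopdf [options] <input> <output>" arguments.
    // --page-size A4 mirrors the PageSize.A4 used in the question.
    public static string BuildArguments(string inputHtmlPathOrUrl, string outputPdfPath)
    {
        return "--page-size A4 " + Quote(inputHtmlPathOrUrl) + " " + Quote(outputPdfPath);
    }

    // Runs the converter and returns its exit code (0 normally means success).
    public static int Convert(string exePath, string input, string output, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = BuildArguments(input, output),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Give up if the conversion hangs instead of blocking the request.
            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();
                throw new TimeoutException("wkhtmltopdf did not finish in time.");
            }
            return process.ExitCode;
        }
    }
}
```

In the question's setup the call might look like `WkHtmlToPdfSketch.Convert(pathToExe, Server.MapPath(relProcessingDoc), Server.MapPath(relDownloadDoc), 30000)`, where `pathToExe` is wherever the wkhtmltopdf executable lives on your server.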

Mic
1

From the question "Convert HTML + CSS to PDF with PHP?" I found out about Prince XML, which has clients for many languages, including the .NET platform.

It is an exceptional converter, though it is commercial and not cheap. There is a Google Tech Talk about it. Allegedly, Google uses it for Google Docs. Its rendering engine also passed the Acid2 test.

If you want high-quality HTML-to-PDF conversion and are willing to spend the ~$3800 for a server license, then look no further. Frankly, I think the cost in time of getting anything else to do what Prince does will quickly outstrip the license cost. Developer time is expensive.

cletus
0

I have used pd4ml for a few things. It seems to work pretty well.

Here is a list of the HTML tags/attributes that pd4ml supports: http://pd4ml.com/html.htm

Matt MacLean
0

ActivePDF is $375 for a single server license, and does an excellent job. We've used it in client projects before and it's been great.

http://www.activepdf.com/products/serverproducts/webgrabber/index.cfm

EDIT: Never mind, it depends on another one of their products that costs $1,400. I thought it would come in cheaper than some of the other suggestions. A few more minutes of research turned up the following alternatives:

Under $500:

http://www.websupergoo.com/abcpdf-1.htm (You'll need the professional edition to keep as much formatting as possible).

Cᴏʀʏ
  • I was looking at abcpdf. The downside is that it needs registry access. My project is on a shared hosting provider, so it doesn't work, which sucks! – Mitchel Sellers Apr 09 '09 at 14:27
  • 2
    From their docs, you CAN use it without registry access, but you have to specify the license number when creating the document object: http://www.websupergoo.com/helppdf7net/default.html. – patmortech Apr 21 '09 at 07:15