I'm trying to publish a cshtml file to PDF, but when the PDF renders, all the HTML formatting is lost. I think the problem might be that I need to render the view as a string, as in the "Render View As String" example, but I'm not using MVC and I don't understand the process well enough to adapt that example to my case. How do I get the view to render so that I don't lose the HTML formatting?

Here's how the code is set up:

public class PrintTemplate<T> : RazorEngine.Templating.TemplateBase<T>
{
    public new T Model { get; set; }

    public PrintTemplate()
    {
        //TODO: Add Constructor Logic
    }
}

public class ViewPage
{
    public string Body { get; set; }
}

public static class PrintPDFBO
{
    public static ViewPage PrintPDF(id)
    {
        var newPrint = new ViewPage();
        var pdf = GetDataForPDF(id);
        newPrint.Body += RazorEngine.Razor.Parse(PrintPDFUtil.GetPrintTemplate(id), pdf, id.ToString());
        newPrint.Body += "</body></html>";
        return newPrint;
    }
}

protected void btnPrintPDF_OnClick(object sender, EventArgs e)
{
    var content = new ViewPage();
    content = PrintPDFBO.PrintPDF(id);
    title = DateTime.Now + "My Title";
}

UPDATE: I've tried depositing the text from my view into a panel and then outputting the panel, but the result is the same: no formatting.

protected void PrintablePdf(ViewPage view, string title)
{
    Response.Clear();
    Response.Buffer = true;
    Response.ContentType = "application/pdf";
    Response.AddHeader("content-disposition", "attachment;fileName=" + title);
    Response.Cache.SetCacheability(HttpCacheability.NoCache);
    //StringBuilder sb = new StringBuilder(view.Body);
    divPrint.InnerHtml = view.Body.ToString();
    StringWriter sw = new StringWriter(sb);
    HtmlTextWriter hw = new HtmlTextWriter(sw);
    pnlPrint.RenderControl(hw);
    StringReader sr = new StringReader(sw.ToString());
    Document pdf = new Document(PageSize.A4, 50f, 50f, 50f, 50f);
    HTMLWorker htmlparser = new HTMLWorker(pdf);
    PdfWriter.GetInstance(pdf, Response.OutputStream);
    pdf.Open();
    htmlparser.Parse(sr);
    pdf.Close();
}

UPDATE for expected output:

Content of the cshtml:

@using Print.DataType
@using Print.Data
@inherits PrintTemplate<PDFPrint>
@*Start*@
<div style="border: 1px solid black; width: 7in; height: 2in;">
  <div style="width: 3.5in; height: 2in; padding: 1em; float: left;">
    <div>
        <div style="float:left; width: 2.5in;">
            <div style="border-bottom: 1px solid black; border-right: 1px solid black; height: .3in; padding-top: .25em;">
                <span style="font-weight: bold;">OPERATOR</span>
            </div>
            <div style="border-right: 1px solid black; height: .27in;">
                <div style="vertical-align: top;">NAME OF OPERATOR</div>
                <div>@Model.Name</div>
            </div>
        </div>
        <div style="float: left; width: 1in;">
            <div style="border-bottom: 1px solid black;">
                <div style="vertical-align: top;">CARD NO.</div>
                <div>@Model.CardNo</div>
            </div>
            <div style="border-bottom: 1px solid black;">
                <div style="vertical-align: top;">DATE ISSUED</div>
                <div>@Model.IssueDate.ToShortDateString()</div>
            </div>
            <div>
                <div style="vertical-align: top;">DATE EXPIRES</div>
                <div>@Model.Expiration.ToShortDateString()</div>
            </div>
        </div>
    </div>

What I expect to see in the pdf is a division with a solid border, multiple lines each with a border, bolded text in some instances, and multiple inner divisions that have specific widths.

What I get instead is just this, no formatting:

Name Date Time

However, the HTML string is intact when it reaches the string builder, so Razor is outputting it correctly.

UPDATE - Implementation of New Page:

I found a post about outputting an ASP.NET Panel to PDF. One person there suggested two ways it could work: make a new page, put the content in the panel, and then print that to PDF, or do it as a stream on the server. So I moved my code to a new page, so that at the very least I could see the output Razor generated from the cshtml page and check whether it was intact. It is: all the border styles, font changes, and widths/heights appear to be intact. From there I tried a normal PDF print of the panel and still lost all formatting in the PDF. The only code I've added is a button handler that calls the PrintablePdf() function, plus a line on page load so that when content is populated it is added to the panel, like so: divPrint.InnerHtml = content.Body;

UPDATE (no resolution): Based on the first suggestion below, I changed PrintablePdf to this. (Correction: I had typed StringBuilder when it should have read StringReader.)

protected void PrintablePdf(string title, string body)
{
    Response.Clear();
    Response.Buffer = true;
    Response.ContentType = "application/pdf";
    Response.AddHeader("content-disposition", "attachment;fileName=" + title);
    Response.Cache.SetCacheability(HttpCacheability.NoCache);

    Document pdf = new Document(PageSize.A4, 50f, 50f, 50f, 50f);
    HTMLWorker htmlparser = new HTMLWorker(pdf);
    PdfWriter.GetInstance(pdf, Response.OutputStream);
    pdf.Open();
    htmlparser.Parse(new StringReader(body));
    pdf.Close();
}

FINAL UPDATE RESOLUTION:

In the end, nothing I tried using the CSHTML worked to preserve the layout in the PDF the way I needed it to. I finally had to resort to dynamically creating the PDFs in codebehind, using iTextSharp's PdfPTable, PdfPCell, and other features to build the PDF manually. I'm not thrilled with the sheer number of nested tables required to pull off the layout I needed, and the code looks horrendously complex; however, I was able to reduce some portions to reusable method calls.

Elaine K
  • What do you mean by "html formatting"? Razor.Parse will return an HTML-formatted string based on the contents of the template you pass it. What is it doing incorrectly? Please supply an input and the actual output versus the expected output. – Ben Robinson Nov 18 '14 at 16:34
  • I've updated the information to include the cshtml and the expected output, thanks. – Elaine K Nov 18 '14 at 18:20
  • Which library are you using to generate PDF? – Floremin Nov 19 '14 at 18:19
  • I'm using iTextSharp.text, iTextSharp.html, and iTextSharp.pdf on the page. – Elaine K Nov 20 '14 at 13:39

3 Answers

0

I haven't used iTextSharp before, but the examples I've seen online don't use as many writers as you have. I suspect that one of them is stripping out the HTML.

Can you try a simpler path from StringBuilder to the html parser?

htmlparser.Parse(new StringReader(sb.ToString()));

These two pages seem to have had the same issue, but said they found a resolution. Their code is similar to yours except for the simplification of the StringReader.

http://forums.asp.net/t/1970922.aspx?iTextSharp+PDF+formatting+problems+from+HTML+tags

ITextSharp HTML to PDF?

EDIT: It seems that the class you are using, HTMLWorker, is deprecated (http://api.itextpdf.com/itext/com/itextpdf/text/html/simpleparser/HTMLWorker.html). The recommendation is to use XMLWorker instead.

Here is an example (in Java) from http://demo.itextsupport.com/xmlworker/itextdoc/flatsite.html

Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document,
    new FileOutputStream("results/loremipsum.pdf"));
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
    new FileInputStream("/html/loremipsum.html"));
document.close();
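
The snippet above uses iText's Java API. A rough C# equivalent for iTextSharp might look like the sketch below. It assumes iTextSharp 5.x with the separate XML Worker package (namespace iTextSharp.tool.xml) is referenced; the class name PdfSketch and the method name HtmlToPdf are made up for illustration. XMLWorker expects well-formed XHTML, so every opened tag in the Razor output has to be closed, and its CSS support is limited, so this is not guaranteed to reproduce a float-based layout.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical helper, not part of iTextSharp itself.
public static class PdfSketch
{
    // Converts a well-formed XHTML string into PDF bytes using XMLWorker.
    public static byte[] HtmlToPdf(string xhtml)
    {
        using (var ms = new MemoryStream())
        {
            var document = new Document(PageSize.A4, 50f, 50f, 50f, 50f);

            // The writer must be attached before the document is opened.
            PdfWriter writer = PdfWriter.GetInstance(document, ms);
            document.Open();

            using (var reader = new StringReader(xhtml))
            {
                // Parses the XHTML (including inline style attributes)
                // and adds the resulting elements to the document.
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
            }

            // Closing the document flushes the PDF into the stream.
            document.Close();

            // ToArray still works after the stream has been closed.
            return ms.ToArray();
        }
    }
}
```

In the question's PrintablePdf, the result could then be sent with something like Response.BinaryWrite(PdfSketch.HtmlToPdf(body)), after the headers have been set.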

Also, look at this post, which talks about the special steps needed to load css, if you plan to use it. Replacing HTMLWorker with XML Worker in iTextSharp

jlee-tessik
  • I did try what's suggested here; however, the result is the same, I get the data but no formatting. See updated example of how I changed it. – Elaine K Nov 25 '14 at 21:41
  • Sorry to ask an obvious question, but in your updated PrintablePdf, have you confirmed that the body param contains the HTML tags? Can you pass in a simpler html blurb to see if that strips it out as well? – jlee-tessik Nov 27 '14 at 18:44
  • So I did look, and the body param just had html that started and ended with tags, whether I passed it as a string directly to the parser or passed it to the asp panel and then rendered that. So I tried adding it to the body param so it's formatted like this: " ... " but the result is still the same. – Elaine K Nov 28 '14 at 00:13
0
using (var srHtml = new StringReader(ConvertedString))
{
    // Parse the HTML
    hw.Parse(srHtml);
}

This is what worked for me: use StringReader instead of StringBuilder.

Akshay Randive
-1

I've had some luck with a library called Spire.PDF.

There is a free version; just check NuGet. That link has instructions on how to convert an HTML string to a PDF. You can also pass it a URL and get back a PDF. Hope you find some use out of this.

tarnold86