
We have an MVC web app that allows downloading dynamically generated PDF reports. I am trying to allow viewing the report in the browser. Because of browser compatibility issues we can't use a JS PDF viewer, so I am working on a controller action that generates the PDF using existing code, converts it to HTML using a third-party program, and returns the HTML.

The third-party program, pdf2htmlEX, is used via a command-line interface, but when I try to invoke it to convert the PDF to HTML, nothing happens. I do not receive an error, but no HTML file is generated.

I first tried a single line to start the conversion, Process.Start("commands here"). When that didn't work, I tried a more advanced way of starting the process that allows capturing StdOut, found in this answer: How To: Execute command line in C#, get STD OUT results. I don't seem to be getting any output that way either. I am not familiar with invoking command-line programs from C#, so I am not sure whether I am making a simple mistake. My current controller action looks like this:

public ActionResult GetReportPdfAsHtml(int userId, ReportType reportType, int page = 1)
{
    // get pdf
    var pdfService = new PdfServiceClient();
    var getPdfResponse = pdfService.GetPdfForUser(new GetPdfForUserRequest {
        UserId = userId,
        ReportType = reportType,
        BaseUri = Request.Url.Host
    });
    pdfService.Close();

    // save pdf to temp location
    var folderRoot = Server.MapPath("~");
    var location = Path.Combine(folderRoot, "pdfTemp");
    var outputDir = $"{location}\\output";
    var fileName = $"{userId}_{reportType}";
    Directory.CreateDirectory(outputDir);
    var file = $"{location}\\{fileName}.pdf";
    // IOFile is an alias of System.IO.File, to avoid a collision with the 'File' method already on the controller
    IOFile.WriteAllBytes(file, getPdfResponse.Pdf);

    //********************************************************************
    //***** Works fine up to here, PDF is successfully generated and saved
    //******************************************************************** 

    // Convert pdf above to html
    var arguments = $"{file} --dest-dir {outputDir} -f {page} -l {page}";
    // Start the child process.
    var p = new Process {
        StartInfo = {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            FileName = Server.MapPath("~\\pdf2htmlEX.exe"),
            Arguments = arguments
        }
    };
    p.Start();
    // Read the output stream first and then wait.
    var output = p.StandardOutput.ReadToEnd();
    p.WaitForExit();

    // Function continues and returns fine, but MVC then errors because the 
    // file isn't created so the path below doesn't exist
    return File($"{outputDir}\\{fileName}.html", "text/html");
}
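
For reference, a variant of the process invocation that also captures the error stream and the exit code might look like the sketch below. It is only a sketch, not a confirmed fix: `ToolRunner`, `ToolResult` and `Quote` are hypothetical names that are not part of the action above. It reads both streams asynchronously so that neither can fill its buffer and block the child process, and it quotes paths in case they contain spaces.

```csharp
using System.Diagnostics;
using System.Text;

// Hypothetical container for what the child process reported.
public sealed class ToolResult
{
    public int ExitCode { get; set; }
    public string StdOut { get; set; }
    public string StdErr { get; set; }
}

// Hypothetical helper; not part of pdf2htmlEX or of the controller above.
public static class ToolRunner
{
    // Wraps a path in double quotes so spaces do not split it into several arguments.
    // Assumes the path contains no quotes and does not end with a backslash.
    public static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    public static ToolResult Run(string exePath, string arguments)
    {
        var stdout = new StringBuilder();
        var stderr = new StringBuilder();

        using (var p = new Process())
        {
            p.StartInfo = new ProcessStartInfo
            {
                FileName = exePath,
                Arguments = arguments,
                UseShellExecute = false,       // required for stream redirection
                RedirectStandardOutput = true,
                RedirectStandardError = true,  // many tools log progress and errors here
                CreateNoWindow = true
            };

            // Collect each stream line by line as it arrives.
            p.OutputDataReceived += (s, e) => { if (e.Data != null) stdout.AppendLine(e.Data); };
            p.ErrorDataReceived += (s, e) => { if (e.Data != null) stderr.AppendLine(e.Data); };

            p.Start();
            p.BeginOutputReadLine();
            p.BeginErrorReadLine();
            p.WaitForExit(); // also waits for the redirected streams to reach end-of-file

            return new ToolResult
            {
                ExitCode = p.ExitCode,
                StdOut = stdout.ToString(),
                StdErr = stderr.ToString()
            };
        }
    }
}
```

In the action above this would be called roughly as `ToolRunner.Run(Server.MapPath("~\\pdf2htmlEX.exe"), $"{ToolRunner.Quote(file)} --dest-dir {ToolRunner.Quote(outputDir)} -f {page} -l {page}")`, after which `ExitCode` and `StdErr` can be logged before attempting to return the HTML file.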

Update: I have tried running the command in a cmd console and it works fine. However, when I run it via the Process.Start() method, I get the following output from pdf2htmlEX:

>temporary dir: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244  
>Preprocessing: 0/1   
>Preprocessing: 1/1  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__css  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__outline  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__pages  
>Working: 0/1  
>Install font 1: (14 0) SUBSET+LatoLightItalic  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.ttf  
>Embed font: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.ttf 1  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__raw_font_1.ttf  
>em size: 2000  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.map  
>Missing space width in font 1: set to 0.5  
>space width: 0.5  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font1.ttf  
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font2.ttf  
>Internal Error: Attempt to output 2147483647 into a 16-bit field. It will be truncated and the file may not be useful.  
>Internal Error: File Offset wrong for ttf table (name-data), -1 expected 150  
>Save Failed  
>Cannot save font to [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font1.ttf
Anduril
  • Try impersonating that pdf conversion code bits as explained [here](https://stackoverflow.com/questions/125341/how-do-you-do-impersonation-in-net/7250145#7250145) or [here](https://support.microsoft.com/en-in/help/306158/how-to-implement-impersonation-in-an-asp-net-application) and check if that works. – Siva Gopal Aug 24 '17 at 08:53
  • You may consider using the [Rotativa](https://github.com/webgio/Rotativa) HTML-to-PDF converter, if it satisfies your requirements and does not need to kick off a process for the conversion. – Siva Gopal Aug 24 '17 at 10:18
  • @SivaGopal Thank you for the links. I tried the user impersonation but sadly it didn't resolve the issue (still useful code though, will remember it for the future). I looked at Rotativa, but it only seems to do html -> pdf, however, I would like pdf -> html – Anduril Aug 24 '17 at 10:45
