We have an MVC web app that lets users download dynamically generated PDF reports. I am trying to let them view a report in the browser as well. Because of browser compatibility issues we can't use a JS PDF viewer, so I am working on a controller action that generates the PDF using existing code, converts it to HTML with a third-party program, and returns the HTML.
The third-party program, pdf2htmlEX, is driven through a command-line interface, but when I invoke it to convert the PDF to HTML, nothing happens. I do not receive an error, but no HTML file is generated.
I first tried a single line to start the conversion, Process.Start("commands here"), but when that didn't work I tried a more involved way of starting the process that captures StdOut, found in this answer: "How To: Execute command line in C#, get STD OUT results". I don't seem to be getting any output from it either. I am not familiar with invoking command-line programs from C#, so I am not sure whether I am making a simple mistake. My current controller action looks like this:
    public ActionResult GetReportPdfAsHtml(int userId, ReportType reportType, int page = 1)
    {
        // get pdf
        var pdfService = new PdfServiceClient();
        var getPdfResponse = pdfService.GetPdfForUser(new GetPdfForUserRequest {
            UserId = userId,
            ReportType = reportType,
            BaseUri = Request.Url.Host
        });
        pdfService.Close();

        // save pdf to temp location
        var folderRoot = Server.MapPath("~");
        var location = Path.Combine(folderRoot, "pdfTemp");
        var outputDir = $"{location}\\output";
        var fileName = $"{userId}_{reportType}";
        Directory.CreateDirectory(outputDir);
        var file = $"{location}\\{fileName}.pdf";
        // IOFile is alias of system.IO.File to avoid collision with the 'File' Method already on the controller
        IOFile.WriteAllBytes(file, getPdfResponse.Pdf);

        //********************************************************************
        //***** Works fine up to here, PDF is successfully generated and saved
        //********************************************************************

        // Convert pdf above to html
        var arguments = $"{file} --dest-dir {outputDir} -f {page} -l {page}";

        // Start the child process.
        var p = new Process {
            StartInfo = {
                UseShellExecute = false,
                RedirectStandardOutput = true,
                FileName = Server.MapPath("~\\pdf2htmlEX.exe"),
                Arguments = arguments
            }
        };
        p.Start();

        // Read the output stream first and then wait.
        var output = p.StandardOutput.ReadToEnd();
        p.WaitForExit();

        // Function continues and returns fine, but MVC then errors because the
        // file isn't created so the path below doesn't exist
        return File($"{outputDir}\\{fileName}.html", "text/html");
    }
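
The action above redirects only standard output and passes the paths unquoted, so an error written to standard error, or a path containing a space, could go unnoticed. Below is a minimal sketch of a more diagnosable invocation, not a confirmed fix. `ToolRunner`, `Quote`, `BuildArguments`, `Run`, and `Result` are hypothetical names introduced for illustration, and the idea that pdf2htmlEX reports failures on standard error is an assumption. The flags are the same ones the action already passes.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ToolRunner
{
    // Wrap a path in double quotes so a space does not split it into two arguments.
    // Assumption: the path has no embedded quotes and does not end in a backslash.
    public static string Quote(string path) => "\"" + path + "\"";

    // Same flags as the controller action (--dest-dir, -f, -l), with quoted paths.
    public static string BuildArguments(string pdfPath, string outputDir, int page) =>
        $"{Quote(pdfPath)} --dest-dir {Quote(outputDir)} -f {page} -l {page}";

    public sealed class Result
    {
        public int ExitCode;
        public string StdOut;
        public string StdErr;
    }

    // Runs the executable, collecting both output streams and the exit code.
    public static Result Run(string exePath, string arguments, string workingDirectory)
    {
        var stdout = new StringBuilder();
        var stderr = new StringBuilder();

        using (var p = new Process())
        {
            p.StartInfo = new ProcessStartInfo
            {
                FileName = exePath,
                Arguments = arguments,
                WorkingDirectory = workingDirectory,
                UseShellExecute = false,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                CreateNoWindow = true
            };

            // Read both streams asynchronously so neither pipe can fill up and block the child.
            p.OutputDataReceived += (s, e) => { if (e.Data != null) stdout.AppendLine(e.Data); };
            p.ErrorDataReceived += (s, e) => { if (e.Data != null) stderr.AppendLine(e.Data); };

            p.Start();
            p.BeginOutputReadLine();
            p.BeginErrorReadLine();
            p.WaitForExit(); // the parameterless overload also waits for the redirected streams to drain

            return new Result
            {
                ExitCode = p.ExitCode,
                StdOut = stdout.ToString(),
                StdErr = stderr.ToString()
            };
        }
    }
}
```

In the action this might be called as `ToolRunner.Run(Server.MapPath("~\\pdf2htmlEX.exe"), ToolRunner.BuildArguments(file, outputDir, page), folderRoot)`, returning an error response when `ExitCode` is non-zero instead of assuming the HTML file exists.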
Update: I have tried running the command in a cmd console, and it works fine. However, when I run it via Process.Start(), I get the following output from pdf2htmlEX:
>temporary dir: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244
>Preprocessing: 0/1
>Preprocessing: 1/1
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__css
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__outline
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__pages
>Working: 0/1
>Install font 1: (14 0) SUBSET+LatoLightItalic
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.ttf
>Embed font: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.ttf 1
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__raw_font_1.ttf
>em size: 2000
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/f1.map
>Missing space width in font 1: set to 0.5
>space width: 0.5
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font1.ttf
>Add new temporary file: [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font2.ttf
>Internal Error: Attempt to output 2147483647 into a 16-bit field. It will be truncated and the file may not be useful.
>Internal Error: File Offset wrong for ttf table (name-data), -1 expected 150
>Save Failed
>Cannot save font to [Redacted]\pdfTemp\temp/pdf2htmlEX-a46244/__tmp_font1.ttf
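
Since the same command succeeds from a cmd console, one thing that may be worth comparing is the environment each run sees. The sketch below is only a diagnostic under that assumption, not an explanation of the font error. `EnvironmentSnapshot` and `Describe` are hypothetical names. It reports the working directory, the account, the temp path, and the process bitness, so the values logged from inside the web app can be compared with those from the console session.

```csharp
using System;
using System.IO;

public static class EnvironmentSnapshot
{
    // Returns a few environment details that commonly differ between an
    // interactive console session and a web application's worker process.
    public static string Describe()
    {
        return string.Join(Environment.NewLine, new[]
        {
            "Working directory: " + Environment.CurrentDirectory,
            "User: " + Environment.UserDomainName + "\\" + Environment.UserName,
            "Temp path: " + Path.GetTempPath(),
            "64-bit process: " + Environment.Is64BitProcess
        });
    }
}
```

Writing the result of `EnvironmentSnapshot.Describe()` to a log just before starting the child process, and running the same call from a small console program, would show whether the two runs differ in any of these respects.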