I need to convert .doc and .docx documents to PDF on the server side using .NET Core. While searching, I found this question, which has a very good answer for the .docx to PDF case: first convert the .docx to HTML using OpenXMLPowerTools, then convert the HTML to PDF. The same answer also suggests a way to get from .doc to .docx: b2xtranslator, a library that converts Microsoft Office binary files to Open XML files. What I am missing is how to actually use this library. I can't find any sample showing how to convert a .doc file with it, apart from this comment on this question.
Based on that comment, I tried to use the library, but I hit a dead end. This is my code:
// check file extension
FileInfo file = new FileInfo(textBox1.Text);
if (file.Extension == ".doc")
{
    FileStream streamDocFile = new FileStream(file.FullName, FileMode.Open);
    var fileDoc = new b2xtranslator.DocFileFormat.WordDocument(new b2xtranslator.StructuredStorage.Reader.StructuredStorageReader(streamDocFile));
    var fileDocx = b2xtranslator.OpenXmlLib.WordprocessingML.WordprocessingDocument.Create(file.Name + "x", b2xtranslator.OpenXmlLib.OpenXmlPackage.DocumentType.Document);
    b2xtranslator.WordprocessingMLMapping.Converter.Convert(fileDoc, fileDocx);
}
My questions are:
- How do I write the .docx file? I don't know whether my code is correct, because I am not sure how to write the fileDocx object to a file so that I can check the result.
- How do I pass the .docx produced by b2xtranslator to Open-XML-PowerTools so that I can convert it to HTML?
Thank you in advance.