1

In my C# application, I get a Word document's XML and want to convert it to HTML using XslCompiledTransform, as this answer and this one suggest.

But the problem is how to get or create the XSL stylesheet to use in this line:

var myXslTrans = new XslCompiledTransform(); 
myXslTrans.Load("stylesheet.xsl");  //<- How to get this
myXslTrans.Transform("source.xml","result.html");

This tutorial shows how to create an XSL stylesheet for an XML document. But is it possible to do that programmatically for whatever XML you have?

Here is a sample of the XML document I have:

<w:wordDocument xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:wsp="http://schemas.microsoft.com/office/word/2003/wordml/sp2" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
  w:macrosPresent="no"
  w:embeddedObjPresent="no"
  w:ocxPresent="no"
  xml:space="preserve">
  <w:ignoreSubtree
    w:val="http://schemas.microsoft.com/office/word/2003/wordml/sp2" />
  <o:DocumentProperties>
    <o:Author>Soft</o:Author>
    <o:LastAuthor>Soft</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>0</o:TotalTime>
    <o:Created>2015-05-12T03:13:00Z</o:Created>
    <o:LastSaved>2015-05-12T03:13:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Words>58</o:Words>
    <o:Characters>336</o:Characters>
    <o:Lines>2</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:CharactersWithSpaces>393</o:CharactersWithSpaces>
    <o:Version>12</o:Version>
  </o:DocumentProperties>
  <o:CustomDocumentProperties>
    <o:EDOID
      dt:dt="float">657360</o:EDOID>
  </o:CustomDocumentProperties>
  <w:fonts>
    <w:defaultFonts
      w:ascii="Calibri"
      w:fareast="Calibri"
      w:h-ansi="Calibri"
      w:cs="Arial" />
    <w:font
      w:name="Arial">
      <w:panose-1
        w:val="020B0604020202020204" />
      <w:charset
        w:val="00" />
      <w:family
        w:val="Swiss" />
      <w:pitch
        w:val="variable" />
      <w:sig
        w:usb-0="E0002AFF"
        w:usb-1="C0007843"
        w:usb-2="00000009"
        w:usb-3="00000000"
        w:csb-0="000001FF"
        w:csb-1="00000000" />
    </w:font>
    <w:font
      w:name="Symbol">
      <w:panose-1
        w:val="05050102010706020507" />
      <w:charset
        w:val="02" />
      <w:family
        w:val="Roman" />
      <w:pitch
        w:val="variable" />
      <w:sig
        w:usb-0="00000000"
        w:usb-1="10000000"
        w:usb-2="00000000"
        w:usb-3="00000000"
        w:csb-0="80000000"
        w:csb-1="00000000" />
    </w:font>
    <w:font
      w:name="Cambria Math">
      <w:panose-1
        w:val="02040503050406030204" />
      <w:charset
        w:val="01" />
      <w:family
        w:val="Roman" />
      <w:notTrueType />
      <w:pitch
        w:val="variable" />
      <w:sig
        w:usb-0="00000000"
        w:usb-1="00000000"
        w:usb-2="00000000"
        w:usb-3="00000000"
        w:csb-0="00000000"
        w:csb-1="00000000" />
    </w:font>
    <w:font
      w:name="Calibri">
      <w:panose-1
        w:val="020F0502020204030204" />
      <w:charset
        w:val="00" />
      <w:family
        w:val="Swiss" />
      <w:pitch
        w:val="variable" />
      <w:sig
        w:usb-0="E10002FF"
        w:usb-1="4000ACFF"
        w:usb-2="00000009"
        w:usb-3="00000000"
        w:csb-0="0000019F"
        w:csb-1="00000000" />
    </w:font>
    <w:font
      w:name="B Titr">
      <w:panose-1
        w:val="00000700000000000000" />
      <w:charset
        w:val="B2" />
      <w:family
        w:val="auto" />
      <w:pitch
        w:val="variable" />
      <w:sig
        w:usb-0="00002001"
        w:usb-1="80000000"
        w:usb-2="00000008"
        w:usb-3="00000000"
        w:csb-0="00000040"
        w:csb-1="00000000" />
...
  </w:body>
</w:wordDocument>

Update

Reading the comments makes me believe that it's not possible to create an XSL stylesheet programmatically for an arbitrary XML document.

So how can I get the XSL for the specific XML in my question? If I open this XML in Microsoft Word, it shows the real Word document with all its styles. I know there is HTML code for it as well. So I want to get the HTML code to show the same result in Microsoft Word.

Ghasem
  • Is it docx XML? If yes, check out the HtmlConverter class in Power Tools. Refer to https://msdn.microsoft.com/en-us/library/ff628051.aspx – potatopeelings May 30 '15 at 07:39
  • "*is it possible to [create an XSL] programmatically for whatever XML you have?*" No, not really. You need to know the structure of both the input and the output before you can map the flow of data from one to the other. – michael.hor257k May 30 '15 at 08:40
  • What's the actual question? How to convert DOCX to HTML? – dlask May 31 '15 at 08:31
  • @dlask No. It was an example. I want to use `XslCompiledTransform` for the Xml docs which doesn't have XSL. – Ghasem May 31 '15 at 09:25
  • Well, but you have to specify the required output somehow. How do you like to do it? – dlask May 31 '15 at 09:33
  • @dlask I'm not really sure if i'm following but I want the output `HTML` to show the result exactly the same as source `XML` – Ghasem May 31 '15 at 13:42
  • Oh yes, I see. So you want to create an HTML file that would *display* your XML *source code*. Is it so? Unfortunately it's hard to understand that from your original question. – dlask May 31 '15 at 13:49
  • @dlask Not the source code. the `HTML` result in browser should be the same as the `XML` result in the browser. – Ghasem May 31 '15 at 13:51
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/79254/discussion-between-dlask-and-alex-jolig). – dlask May 31 '15 at 13:58
  • You might be interested in [PowerTools for Open XML](http://powertools.codeplex.com/) – HadiRj Jun 02 '15 at 05:42
  • You need to explain more what you need (inputs, expected outputs). This is not clear at all. – Simon Mourier Jun 02 '15 at 06:27
  • @AlexJolig You are looking for something that cannot exist. First, there is an unlimited number of possible XML schemas, Next, there is no single way to present XML content as HTML; that's something the designer of the XSLT would have to decide upon. I have no idea what you mean by "*I want the output HTML to show the result exactly the same as source XML*" An XML has no "exact" way it should look in a browser, unless you mean to show the actual XML document as it is, markup and all (and even then there are nuances). ... – michael.hor257k Jun 02 '15 at 06:31
  • ... Your question would make sense if it were limited to XML documents conforming to the Microsoft Office Word 2003 XML schema, and asked to present them in a browser looking (as much as possible) the same as they would when opened by the Word application. If that's your goal, it might be accomplished more easily by asking Word to export the document as HTML. Or by using this tool https://www.microsoft.com/en-us/download/details.aspx?id=1109 (which I believe also contains an XSLT stylesheet you could extract and use on its own). – michael.hor257k Jun 02 '15 at 06:44
  • @michael.hor257k please read my **update** question. Hope that's more clear what I need – Ghasem Jun 02 '15 at 06:54
  • @AlexJolig "*Hope that's more clear what I need*' Not really. "*...get the HTML code to show the same result in Microsoft word*" Did you mean **in a browser**? – michael.hor257k Jun 02 '15 at 06:56
  • @michael.hor257k No. mentioning browser was a mistake. I want to open the html code in `Microsoft word` and get the same styled document I had with Xml code. – Ghasem Jun 02 '15 at 07:10
  • What HTML code? This is making less and less sense. – michael.hor257k Jun 02 '15 at 07:22
  • @michael.hor257k I'm sorry. I guess you never opened an xml or html code in `word document`. I mean the xml or html code of an word document – Ghasem Jun 02 '15 at 07:32
  • @AlexJolig Guess again. Even better: leave my personal experience out of the discussion and concentrate on making your intentions clear. Right now, you are doing a very poor job of it. The document shown in your question is XML, not HTML. You can open it directly in any application that supports WordML (such as Microsoft Word). – michael.hor257k Jun 02 '15 at 07:51
  • @michael.hor257k Exactly! I want to convert that XML to HTML in such a way that Microsoft Word can still open it as a Word document – Ghasem Jun 02 '15 at 08:03
  • If you convert that XML to HTML, then it will no longer be a Word document. That doesn't mean Word won't be able to open it. It is still not clear what *kind* of HTML document you want to produce (and for what purpose). – michael.hor257k Jun 02 '15 at 08:15
  • @michael.hor257k If you copy the whole Word document to the clipboard and then get it with `Format.Html`, it will give you the HTML which produces the same result as the XML in a `word document`. That's the kind of HTML I want – Ghasem Jun 02 '15 at 08:24
  • If you want the same conversion that MS Word does, you should look at the link I posted earlier. I am not a Windows user, but [I am told](http://blogs.msdn.com/b/brian_jones/archive/2005/09/30/475794.aspx) you can extract the XSLT from there and use it on its own. – michael.hor257k Jun 02 '15 at 08:31
  • @michael.hor257k Ok. Thanks. I will try it and see if it helps – Ghasem Jun 02 '15 at 08:39

3 Answers

2

So how can I get XSL for this specific XML which is in my question?

Use Microsoft Word's XSL file:
C:\Program Files\Microsoft Office\Office15\XML2WORD.XSL
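
A minimal sketch of how that file might be plugged into the code from the question. The class and method names here are hypothetical, and the install path differs between Office versions, so the helper checks for the file first. It also assumes the stylesheet loads with default settings; if it turned out to rely on `document()` or embedded script, `Load` would need an `XsltSettings` argument.

```csharp
using System;
using System.IO;
using System.Xml.Xsl;

public static class WordXmlToHtml
{
    // Returns false when no stylesheet exists at the given path
    // (the Office folder name and location vary by version),
    // true once the transform has been written to resultHtmlPath.
    public static bool TryConvert(string stylesheetPath, string sourceXmlPath, string resultHtmlPath)
    {
        if (!File.Exists(stylesheetPath))
            return false;

        var xslt = new XslCompiledTransform();
        xslt.Load(stylesheetPath);                      // compile the stylesheet
        xslt.Transform(sourceXmlPath, resultHtmlPath);  // XML file in, HTML file out
        return true;
    }
}
```

Usage would look like `WordXmlToHtml.TryConvert(@"C:\Program Files\Microsoft Office\Office15\XML2WORD.XSL", "source.xml", "result.html")`.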

Jeremy Thompson
0

Can't you use XSLT parameters for the dynamic part?

https://msdn.microsoft.com/en-us/library/dfktf882%28v=vs.110%29.aspx

You could pass a parameter into the stylesheet and use it to switch between different XSL statements.
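
A small sketch of that idea, using `XsltArgumentList.AddParam` from `System.Xml.Xsl`. The inline stylesheet, the `mode` parameter, and the `<doc>` input element are all made up for illustration; they are not part of the question's WordML document.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ParameterDemo
{
    // Hypothetical stylesheet: the "mode" parameter picks one of two branches.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html'/>
  <xsl:param name='mode' select=""'plain'""/>
  <xsl:template match='/doc'>
    <xsl:choose>
      <xsl:when test=""$mode = 'bold'""><b><xsl:value-of select='.'/></b></xsl:when>
      <xsl:otherwise><p><xsl:value-of select='.'/></p></xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xml, string mode)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(styleReader);

        // Pass the parameter by name; the empty string means "no namespace".
        var args = new XsltArgumentList();
        args.AddParam("mode", "", mode);

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, args, output);
            return output.ToString();
        }
    }
}
```

Calling `ParameterDemo.Transform("<doc>hi</doc>", "bold")` takes the `<b>` branch, while any other value falls through to the `<p>` branch. Note that the parameter only selects among branches the stylesheet author already wrote; it does not generate a stylesheet for unknown XML.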

Sebastian G. Marinescu
-1

Updated answer:

Since an XSLT stylesheet is itself an XML document, we can simply build an XML file containing the XSLT elements and attributes and give it the .xsl extension. We can follow this strategy:

... 
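// requires: using System.Xml.Linq;
XNamespace xsl = "http://www.w3.org/1999/XSL/Transform";
var document = new XDocument();

var rootElement = new XElement(xsl + "stylesheet");
// adding attributes like namespaces etc...
rootElement.Add(new XAttribute(XNamespace.Xmlns + "xsl", xsl.NamespaceName));
rootElement.Add(new XAttribute("version", "1.0"));

document.Add(rootElement);
var em = new XElement(xsl + "template");
em.SetAttributeValue("match", "/");
rootElement.Add(em);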
....

This is just an idea. You can build your XSLT document like this, as long as you follow the correct syntax. Apologies for misunderstanding the question in my first answer. Hope this helps.
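
As a rough end-to-end sketch of this idea in C#, the stylesheet can be assembled with LINQ to XML and handed straight to `XslCompiledTransform` through `XDocument.CreateReader()`. The class name and the trivial template (which just wraps the document's text in a paragraph) are illustrative assumptions; a real stylesheet for WordML would need templates for each element you want to render.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class GeneratedStylesheet
{
    static readonly XNamespace Xsl = "http://www.w3.org/1999/XSL/Transform";

    // Builds <xsl:stylesheet> with one template that matches the document root.
    public static XDocument Build()
    {
        var valueOf = new XElement(Xsl + "value-of", new XAttribute("select", "."));
        var body = new XElement("html", new XElement("body", new XElement("p", valueOf)));

        var template = new XElement(Xsl + "template", new XAttribute("match", "/"), body);

        var root = new XElement(Xsl + "stylesheet",
            new XAttribute(XNamespace.Xmlns + "xsl", Xsl.NamespaceName),
            new XAttribute("version", "1.0"),
            new XElement(Xsl + "output", new XAttribute("method", "html")),
            template);

        return new XDocument(root);
    }

    // Compiles the generated stylesheet and applies it to an XML string.
    public static string Apply(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = Build().CreateReader())
            xslt.Load(styleReader);

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Generating the stylesheet in code removes the need for a file on disk, but you still have to decide what each template outputs, which is the point made in the comments on the question.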

killer