I'm hoping I can forgo the history, but trust me on the following:
- I have several people who have immediate access to MSWord 2007
- We are trying to prep a generic Word document that can be passed from person to person over the course of several months and they can "add" new content to it.
Regardless of the answers below - the above will stay the same no matter how horrible an idea it is, or what better idea you may have... I've already been down this road :P.
- My 'thoughts' were to set up (within Word) an XML Schema so we could 'flag' the content for the specific content areas (e.g. item number, item description, item stem, item options, item answer, etc.)
- I taught myself XML schema in a little under 6 hours, and apparently I'm a horrible teacher: I have the XML Schema file, I have imported it into Word, I am able to flag the areas as per all the online tutorials...
- I was HOPING to save out to an "XML" file (from Word) and have it look like:
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
(I just pulled that off a random site to demonstrate that I want to save out of the Word document the XML structure with the data filled in)
The hope was that I could then parse it with Python, or send the XML file to a vendor who could then upload the information into a database (and no - we can't just upload to the database - it has to go from the Word document to XML to the vendor).
The problem: Whenever I save the file to XML from MSWord 2007 it gives me all this horrible horrible XML crap all over the place. I've checked to see if I could parse that, hoping to find my XML tags embedded, and I do find them, but they're so garbled by all of Office's own tags/crap that parsing them out would be a huge waste of time.
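For reference, here is a minimal sketch of the "parse it with Python" route. It is built on assumptions, not on my actual file: it assumes the document was saved in Word 2007's own XML flavour, where each flagged region shows up as a w:customXml element whose w:element attribute holds my tag name, and where the visible text sits in w:t elements. The function names and the SAMPLE string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Word 2007 (assumed save format).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
CUSTOM = f"{{{W}}}customXml"   # wrapper Word puts around a flagged region
TEXT = f"{{{W}}}t"             # element holding the visible text of a run
NAME = f"{{{W}}}element"       # attribute carrying the schema tag name

# Hypothetical, heavily trimmed stand-in for what Word writes out.
SAMPLE = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:customXml w:element="note">
      <w:customXml w:element="to">
        <w:p><w:r><w:t>Tove</w:t></w:r></w:p>
      </w:customXml>
      <w:customXml w:element="from">
        <w:p><w:r><w:t>Ja</w:t></w:r><w:r><w:t>ni</w:t></w:r></w:p>
      </w:customXml>
    </w:customXml>
  </w:body>
</w:document>
"""


def _collect(node, parent):
    """Walk the Word tree, copying only flagged regions and their text."""
    for child in node:
        if child.tag == CUSTOM:
            # Start a clean element named after the schema tag, then recurse.
            clean = ET.SubElement(parent, child.get(NAME))
            _collect(child, clean)
        elif child.tag == TEXT:
            # Word may split one phrase over several runs, so concatenate.
            parent.text = (parent.text or "") + (child.text or "")
        else:
            # Paragraphs, runs, formatting: skip the wrapper, keep descending.
            _collect(child, parent)


def extract_custom_xml(word_xml):
    """Return a clean tree under a hypothetical <extracted> root element."""
    holder = ET.Element("extracted")
    _collect(ET.fromstring(word_xml), holder)
    return holder


if __name__ == "__main__":
    print(ET.tostring(extract_custom_xml(SAMPLE), encoding="unicode"))
```

This ignores attributes, namespaces on my own tags, and mixed content (text that sits between two child tags just gets appended to the parent's text), and any text outside a flagged region ends up on the <extracted> root, so it is only a starting point.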
Finally: How can I have Word automatically fill in the XML tags from a schema I develop (or can I just create a sample XML tree without the schema?) and export the contents ready for uploading/parsing? By "automatically" I understand that someone still has to select the text and assign the XML tag; I'm talking about the 'saving out' to XML.
Thanks for reading my short novel :P (hope I was clear enough!)
-J