
Sorry if this has been asked before.

I'd like to read and write DOCX/OpenXML files with C++ (or Objective-C++) on Mac. I'm not only interested in the text but also in the formatting (bold, italic, underline, strikethrough, font color).

I couldn't find any libraries that do that.

The official Open XML SDK appears to be C# and Windows only.

Am I overlooking something obvious? Is [NSAttributedString initWithData...] my best option?

Edit: The answers to the other questions suggest either (1) unzipping the DOCX package and working with the XML files directly, or (2) using .NET and/or C#. Another suggestion is (3) DuckX.

(1) I don't want to write my own XML parser just to read the formatting information; I don't think that's the right way to go. (2) As the tags and title say, I'm looking for a C++ solution that works on macOS. (3) DuckX is great but doesn't support tables within Word documents.

guitarflow
  • Does this answer your question? [Reading .docx in C++](https://stackoverflow.com/questions/1161295/reading-docx-in-c) – L. Scott Johnson Jul 10 '20 at 12:41
  • Googling turns up https://github.com/amiremohamadi/DuckX – L. Scott Johnson Jul 10 '20 at 12:43
  • DuckX is great but doesn't support formatting, from what I can see. Last time I checked, tables within Word documents were also not readable. Most of the answers to the other questions address Windows and .NET, which is not an option as I need this within a C++ Mac app. – guitarflow Jul 10 '20 at 15:42
  • What do you want the library to do? You feed it a docx file, and get back... what? – n. m. could be an AI Jul 10 '20 at 16:23
  • @guitarflow duckX uses pugixml, so you should be able to dip into the rest of the XML via that, just like duckX does with its text-manipulating convenience functions. – L. Scott Johnson Jul 10 '20 at 16:38
  • @n.'pronouns'm. I'd like to get an attributed string from it. I want to display the text from the Word file within my app, and I need not only the plain text but also the text attributes. – guitarflow Jul 10 '20 at 17:03
  • The document tree or whatever you get from an XML parser is an attributed string. Displaying it is a different story altogether. They say [TextEdit](https://en.wikipedia.org/wiki/TextEdit) can read and display ooxml, maybe you want to look at its code. DuckX certainly cannot display anything or lay out any text, it's a very shallow wrapper over an XML document tree. – n. m. could be an AI Jul 10 '20 at 17:37
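
To illustrate the point made in the comments that the XML tree already carries the text attributes, here is a minimal sketch of how the runs in `word/document.xml` map to "text plus attributes". It assumes the package has already been unzipped and the XML loaded into a string. The `Run` struct and the helpers `findTag`, `elementText`, `attr` and `parseRuns` are hypothetical names made up for this example. It uses naive string scanning rather than a real XML parser, so it is only meant to show the structure (`w:r`, `w:rPr`, `w:b`, `w:i`, `w:u`, `w:strike`, `w:color`, `w:t`); a real app would more likely walk the tree with a parser such as pugixml, as DuckX does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One formatted run of text, roughly what an attributed string stores.
struct Run {
    std::string text;
    bool bold = false, italic = false, underline = false, strike = false;
    std::string color;  // hex RGB such as "FF0000"; empty if not set
};

// Position of the next opening tag <name>, <name ...> or <name/>, or npos.
// Checking the character after the name keeps "w:r" from matching "w:rPr".
static std::size_t findTag(const std::string& s, const std::string& name,
                           std::size_t from = 0) {
    const std::string open = "<" + name;
    for (std::size_t p = s.find(open, from); p != std::string::npos;
         p = s.find(open, p + 1)) {
        const std::size_t next = p + open.size();
        if (next < s.size() &&
            (s[next] == '>' || s[next] == ' ' || s[next] == '/'))
            return p;
    }
    return std::string::npos;
}

// Inner content of the first <name ...>...</name> element, or "".
static std::string elementText(const std::string& s, const std::string& name) {
    const std::size_t p = findTag(s, name);
    if (p == std::string::npos) return "";
    const std::size_t a = s.find('>', p);
    if (a == std::string::npos || s[a - 1] == '/') return "";  // self-closing
    const std::size_t b = s.find("</" + name + ">", a);
    if (b == std::string::npos) return "";
    return s.substr(a + 1, b - a - 1);
}

// Value of attribute `key` on the first <tag ...> element, or "".
static std::string attr(const std::string& s, const std::string& tag,
                        const std::string& key) {
    const std::size_t p = findTag(s, tag);
    if (p == std::string::npos) return "";
    const std::size_t end = s.find('>', p);
    const std::string needle = key + "=\"";
    std::size_t a = s.find(needle, p);
    if (a == std::string::npos || a > end) return "";
    a += needle.size();
    const std::size_t b = s.find('"', a);
    return b == std::string::npos ? "" : s.substr(a, b - a);
}

// Collects every <w:r> run with the formatting flags found in its <w:rPr>.
std::vector<Run> parseRuns(const std::string& xml) {
    std::vector<Run> runs;
    std::size_t pos = 0;
    while ((pos = findTag(xml, "w:r", pos)) != std::string::npos) {
        const std::size_t end = xml.find("</w:r>", pos);
        if (end == std::string::npos) break;
        const std::string run = xml.substr(pos, end - pos);
        const std::string props = elementText(run, "w:rPr");
        Run r;
        r.text = elementText(run, "w:t");
        r.bold = findTag(props, "w:b") != std::string::npos;
        r.italic = findTag(props, "w:i") != std::string::npos;
        r.underline = findTag(props, "w:u") != std::string::npos;
        r.strike = findTag(props, "w:strike") != std::string::npos;
        r.color = attr(props, "w:color", "w:val");
        runs.push_back(r);
        pos = end + 1;
    }
    return runs;
}
```

Calling `parseRuns` on the contents of `word/document.xml` yields one `Run` per `w:r` element, which could then be converted into ranges of an attributed string. The sketch deliberately ignores several things a real reader has to handle: toggles switched off with `w:val="false"`, formatting inherited from styles, XML entities in the text, self-closing or nested runs, and tables.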

0 Answers