Using C#, is there a good way to find and replace a text string in a .docx file without having Word installed on the machine?
2 Answers
Yes, using Open XML. Here's an article which addresses your specific question: Creating a Simple Search and Replace Utility for Word 2007 Open XML Format Documents
To work with this file format, one option is to use the Open XML Format Application Programming Interface (API) in the DocumentFormat.OpenXml.Packaging namespace. The classes, methods, and properties in this namespace are located in the DocumentFormat.OpenXml.dll file. You can install this DLL file by installing the Open XML Format SDK version 1.0. The members in this namespace allow you to easily work with the package contents for Excel 2007 workbooks, PowerPoint 2007 presentations, and Word 2007 documents.
...
    ' Assumes Imports System.IO, System.Xml and DocumentFormat.OpenXml.Packaging.
    ' wordmlNamespace and numChanged are declared elsewhere in the article's code.
    Private Sub Search_Replace(ByVal file As String)
        ' The Using block closes the package so the changes are flushed to disk.
        Using wdDoc As WordprocessingDocument = WordprocessingDocument.Open(file, True)
            ' Manage namespaces to perform Xml XPath queries.
            Dim nt As NameTable = New NameTable
            Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
            nsManager.AddNamespace("w", wordmlNamespace)

            ' Get the document part from the package.
            Dim xdoc As XmlDocument = New XmlDocument(nt)
            ' Load the XML in the part into an XmlDocument instance.
            xdoc.Load(wdDoc.MainDocumentPart.GetStream)

            ' Get the text nodes in the document.
            Dim nodes As XmlNodeList = xdoc.SelectNodes("//w:t", nsManager)
            Dim nodeText As String = ""

            ' Make the swap.
            Dim oldText As String = txtOldText.Text
            Dim newText As String = txtNewText.Text
            For Each node As XmlNode In nodes
                ' Skip empty w:t elements, which have no child text node.
                If node.FirstChild Is Nothing Then Continue For
                nodeText = node.FirstChild.InnerText
                If (InStr(nodeText, oldText) > 0) Then
                    nodeText = nodeText.Replace(oldText, newText)
                    ' Write the replaced text back to the node.
                    node.FirstChild.InnerText = nodeText
                    ' Increment the occurrences counter.
                    numChanged += 1
                End If
            Next

            ' Write the changes back to the document.
            xdoc.Save(wdDoc.MainDocumentPart.GetStream(FileMode.Create))
        End Using

        ' Display the number of change occurrences.
        txtNumChanged.Text = numChanged
    End Sub
- Thanks, that definitely got me started. It looks like it is all based on System.IO.Packaging. Since this is fairly simple, can it be done without the Open XML Format SDK? – TimothyAWiseman Jul 30 '10 at 23:30
- Absolutely - I rarely use the SDK myself. I primarily program against PowerPoint (`PresentationML` and `DrawingML` as opposed to Word's `WordProcessingML`) using only `System.IO.Packaging` and Linq-to-XML. So I'll have to point you to a Ken Getz article: http://msdn.microsoft.com/en-us/library/bb738371(office.12).aspx. Look for any more of his articles written in 2006 - they all use `System.IO.Packaging`. After that, he started writing articles with the SDK. You can also check out http://www.openxmldeveloper.org – Todd Main Jul 30 '10 at 23:42
- Awesome, thank you. Also, this reference by Vikas Goyal helped me tremendously both in getting the answer I needed and in (mostly) understanding what was going on in the process: http://www.devx.com/dotnet/Article/42221/1954 – TimothyAWiseman Aug 02 '10 at 04:30
- @timothyawiseman: The devx article is a really good one. Glad to hear this is working out for you. – Todd Main Aug 02 '10 at 05:50
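For anyone landing here later, the SDK-free route discussed in these comments (`System.IO.Packaging` plus Linq-to-XML) could look roughly like the C# sketch below. It is not taken from any of the linked articles. The class name `DocxSearchReplace` and the method `Replace` are made up for illustration, and it assumes a Transitional-format .docx (Strict documents use different namespace and relationship URIs). It also only finds matches that sit inside a single `w:t` element; Word frequently splits a phrase across several runs, and such matches are missed.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework; System.IO.Packaging NuGet package on newer .NET
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named for illustration only.
static class DocxSearchReplace
{
    // Main WordprocessingML namespace (Transitional format).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Relationship type that links the package root to the main document part.
    const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Replaces oldText with newText inside each w:t element of the main
    // document part and returns the number of elements that were changed.
    public static int Replace(string path, string oldText, string newText)
    {
        if (string.IsNullOrEmpty(oldText))
            throw new ArgumentException("Search text must not be empty.", "oldText");

        using (Package package = Package.Open(path, FileMode.Open, FileAccess.ReadWrite))
        {
            // Find the main document part via its relationship instead of
            // hard-coding "/word/document.xml".
            PackageRelationship rel =
                package.GetRelationshipsByType(OfficeDocumentRelType).First();
            Uri partUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart part = package.GetPart(partUri);

            XDocument xdoc;
            using (Stream input = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                // Keep whitespace exactly as stored; it can be significant in w:t.
                xdoc = XDocument.Load(input, LoadOptions.PreserveWhitespace);
            }

            int changed = 0;
            foreach (XElement t in xdoc.Descendants(W + "t"))
            {
                if (t.Value.Contains(oldText))
                {
                    t.Value = t.Value.Replace(oldText, newText);
                    changed++;
                }
            }

            // FileMode.Create truncates the part so no stale bytes are left behind.
            using (Stream output = part.GetStream(FileMode.Create, FileAccess.Write))
            {
                xdoc.Save(output, SaveOptions.DisableFormatting);
            }

            return changed;
        }
    }
}
```

Calling it would be something like `int n = DocxSearchReplace.Replace("report.docx", "old text", "new text");`, where `report.docx` is just a placeholder path. Headers, footers and footnotes live in separate parts of the package, so this sketch does not touch them.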
You may also try Aspose.Words for .NET to find and replace text in a Word document. This component doesn't require MS Office to be installed, and the API is simple to use.
Disclosure: I work as a developer evangelist at Aspose.
