
I am using the following C# function to find and replace text in a Word document using the Open XML SDK.

// To search and replace content in a document part.
public static void SearchAndReplace(string document)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }

        Regex regexText = new Regex("Findme");
        docText = regexText.Replace(docText, "Before *&* After"); // <-- Replacement text

        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(docText);
        }
    }
}

The code works fine in most cases, but when the replacement text contains "&", as above, there is an error when opening the document.

The error says: The file XYZ cannot be opened because there are problems with the contents. Details: Illegal name character.

The same error occurred when I used "<w:br/>" in the replacement string to insert a line break at the end of the string, but it went away when I added a space after it: "<w:br/> ".

PS: The replacement text is marked with a comment in the code.

Kirk Woll
mribot
  • Did you try "Before &amp; After"? – Kirk Woll Nov 01 '13 at 05:40
  • I tried it now and yes, it's working. Can you please explain why this happens? The replacement text will be dynamic, i.e. I won't know where the "&" is, and there are too many texts to replace, so searching for the position of "&" in every string is not feasible. Are there any other cases where this kind of problem occurs? – mribot Nov 01 '13 at 05:49
  • It's a special character. All special characters in XML must be escaped. See [this question](http://stackoverflow.com/questions/7248958/which-are-the-html-and-xml-special-characters). – Kirk Woll Nov 01 '13 at 05:51
  • So how can we do it if the text is dynamic? Will I have to use a regular expression, or is there a simpler way? – mribot Nov 01 '13 at 06:18
  • You can always pass your replacement text through `HttpUtility.HtmlEncode`. (Although it says "HtmlEncode", it works just as well for XML.) This way, the text will always be escaped if necessary. – Kirk Woll Nov 01 '13 at 06:23
  • Yes, it's working now, but the problem is with " (double quotes), because they can't be stored in a string literal, so we can't apply HtmlEncode, e.g. string abc = " This is "actual" text"; Is there any way to handle inner double quotes? – mribot Nov 07 '13 at 09:11

1 Answer

// To search and replace content in a document part.

public static void SearchAndReplace(string document)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }

        Regex regexText = new Regex("Findme");
        // XText.ToString() XML-escapes the replacement text ("&" becomes "&amp;"),
        // so the document part remains well-formed XML after the replace.
        docText = regexText.Replace(docText, new System.Xml.Linq.XText("Before *&* After").ToString());

        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(docText);
        }
    }
}
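
For dynamic replacement text, the escaping step can be pulled into a small helper so every string goes through it. The following is only a minimal sketch: the class `XmlEscapeSketch`, the helper `EscapeForXml`, and the sample `docText` string are hypothetical names used for illustration and stand in for the XML read from `MainDocumentPart`. It also shows how inner double quotes can be written in a C# string literal, which is a question of literal syntax rather than of XML escaping.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class XmlEscapeSketch
{
    // Hypothetical helper: returns the text with XML markup characters
    // such as "&" and "<" escaped, so it can sit inside an XML element.
    public static string EscapeForXml(string text)
    {
        return new XText(text).ToString();
    }

    public static void Main()
    {
        // Inner double quotes are written as \" in a regular string literal
        // (or as "" inside a verbatim @"..." literal).
        string replacement = "Before & \"After\"";
        string escaped = EscapeForXml(replacement);

        // Assumed stand-in for the document XML read via StreamReader.
        string docText = "<w:t>Findme</w:t>";

        // A MatchEvaluator inserts the escaped text literally, so a "$" in
        // dynamic text is not interpreted as a regex substitution pattern.
        string result = Regex.Replace(docText, "Findme", _ => escaped);

        Console.WriteLine(result);
    }
}
```

Using a `MatchEvaluator` instead of a replacement string is a design choice for dynamic input: with a plain replacement string, sequences such as `$1` would be treated as group references.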
CesarGon
Hari Raj