
As the title states, I am trying to merge multiple Word (.docx) files into one Word document. Each of these documents is one page long. I am using some of the code from this post in this implementation. The issue I am running into is that only the first document gets written properly; every other iteration appends a new document, but its content is the same as the first.

Here is the code I am using:

//list that holds the file paths
List<String> fileNames = new List<string>();
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");

//get the first document
MemoryStream mainStream = new MemoryStream();
byte[] buffer = File.ReadAllBytes(fileNames[0]);
mainStream.Write(buffer, 0, buffer.Length);

using (WordprocessingDocument mainDocument = WordprocessingDocument.Open(mainStream, true))
{
    //xml for the new document
    XElement newBody = XElement.Parse(mainDocument.MainDocumentPart.Document.Body.OuterXml);
    //iterate through each file
    for (int i = 1; i < fileNames.Count; i++)
    {
        //read in the document
        byte[] tempBuffer = File.ReadAllBytes(fileNames[i]);
        WordprocessingDocument tempDocument = WordprocessingDocument.Open(new MemoryStream(tempBuffer), true);
        //new documents XML
        XElement tempBody = XElement.Parse(tempDocument.MainDocumentPart.Document.Body.OuterXml);
        //add the new xml
        newBody.Add(tempBody);
        string str = newBody.ToString();
        //write to the main document and save
        mainDocument.MainDocumentPart.Document.Body = new Body(newBody.ToString());
        mainDocument.MainDocumentPart.Document.Save();
        mainDocument.Package.Flush();
        tempBuffer = null;
    }
    //write entire stream to new file
    FileStream fileStream = new FileStream("xmltest.docx", FileMode.Create);
    mainStream.WriteTo(fileStream);
    //ret = mainStream.ToArray();
    mainStream.Close();
    mainStream.Dispose();
}

Again, the problem is that each new document being appended has the same content as the first document, so when I run this the output is a document with five identical pages. I've tried switching the order of the documents in the list and get the same result, so it is nothing specific to one document. Could anyone suggest what I am doing wrong here? I'm looking through it and I can't explain the behavior I am seeing. Any suggestions would be appreciated. Thanks much!

Edit: I'm thinking this may have something to do with the fact that the documents I am trying to merge have been generated with custom XML parts. I suspect that the XPath expressions in the documents are somehow pointing to the same content. The thing is, I can open each of these documents and see the proper content; it's only when I merge them that I see the issue.
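A minimal sketch of how that suspicion could be checked, assuming the main document XML is available as a string (for example from `mainDocument.MainDocumentPart.Document.OuterXml`). The class and method names here are hypothetical, and the code uses only `System.Xml.Linq`, not the Open XML SDK. It lists the `w:xpath` value of every `w:dataBinding` element, so the bindings of two documents can be compared side by side.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class BindingInspector
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:xpath value of every w:dataBinding element in the XML.
    public static List<string> GetBoundXPaths(string documentXml)
    {
        return XDocument.Parse(documentXml)
            .Descendants(W + "dataBinding")
            .Select(binding => (string)binding.Attribute(W + "xpath"))
            .Where(xpath => xpath != null)
            .ToList();
    }
}
```

If every document returns the same list of XPath values, the merged body would carry several content controls bound to the same XPath, which would be consistent with the pages showing identical content.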

TheMethod
  • What does the `document.xml` look like? Another possibility is matching IDs. – user7116 Jul 23 '12 at 17:58
  • The document.xml confirms that the XPath for each of the pages is bound to the same thing, for example w:xpath="/project[1]/ProjectDescription[1]". While this works fine for each document on its own, when they are merged they all use the same source. I'm not sure what my options are at this point. I need a way to have each document populated with its content before I merge them. – TheMethod Jul 23 '12 at 18:26
  • the `linq` tag isn't appropriate here. – Marty Neal Jul 23 '12 at 21:36
  • XElement is part of the System.Xml.Linq namespace, so I chose to include it. – TheMethod Jul 24 '12 at 11:33
  • As another option, our MergeDocx product can merge documents which contain custom xml data bindings. – JasonPlutext Jul 05 '13 at 03:10

2 Answers


This solution uses DocumentFormat.OpenXml and imports each additional document into the first one as an AltChunk.

// Requires: using System.IO; using System.Linq;
//           using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Wordprocessing;
public static void Join(params string[] filepaths)
{
    //filepaths = new[] { "D:\\one.docx", "D:\\two.docx", "D:\\three.docx", "D:\\four.docx", "D:\\five.docx" };
    if (filepaths == null || filepaths.Length < 2)
        return;

    //the first file is opened for editing, so the merged result is saved into it
    using (WordprocessingDocument myDoc = WordprocessingDocument.Open(filepaths[0], true))
    {
        MainDocumentPart mainPart = myDoc.MainDocumentPart;

        for (int i = 1; i < filepaths.Length; i++)
        {
            //embed the whole source file as an alternative format import part
            string altChunkId = "AltChunkId" + i;
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (FileStream fileStream = File.Open(filepaths[i], FileMode.Open))
            {
                chunk.FeedData(fileStream);
            }
            AltChunk altChunk = new AltChunk();
            altChunk.Id = altChunkId;
            //new page, if you like it...
            mainPart.Document.Body.AppendChild(new Paragraph(new Run(new Break() { Type = BreakValues.Page })));
            //next document
            mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
        }
        mainPart.Document.Save();
        //no explicit Close() needed: disposing at the end of the using block closes the package
    }
}
Emanuele Greco
  • Thanks, you helped me a lot. I just needed to change one line, `mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());`, to use OfType with LINQ: `mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.OfType<Paragraph>().Last());` – seergius96 May 05 '23 at 09:58

The merging approach in the question may not always work properly. You can try one of these approaches:

  1. Using AltChunk, as in http://blogs.msdn.com/b/ericwhite/archive/2008/10/27/how-to-use-altchunk-for-document-assembly.aspx

  2. Using the DocumentBuilder.BuildDocument method from http://powertools.codeplex.com/

If you still face a similar issue, you can find the data-bound controls prior to the merge and assign data to these controls from the CustomXml part. You can find this approach in the AssignContentFromCustomXmlPartForDataboundControl method of the OpenXmlHelper class. The code can be downloaded from http://worddocgenerator.codeplex.com/
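As a rough illustration of a simpler variant of that last idea (not the code from the linked project), the sketch below strips the `w:dataBinding` elements from a body before it is merged, so the content controls keep whatever text they already contain instead of re-resolving against a shared custom XML part. It assumes each control's `w:sdtContent` already holds the correct text for its document; if it does not, the values would first have to be copied from the CustomXml part as described above. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class BindingRemover
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Removes every w:dataBinding element under the given element and
    // returns how many were removed. The content controls themselves
    // and their current content are left untouched.
    public static int RemoveDataBindings(XElement body)
    {
        // Materialize the list first so removal does not disturb enumeration.
        List<XElement> bindings = body.Descendants(W + "dataBinding").ToList();
        foreach (XElement binding in bindings)
        {
            binding.Remove();
        }
        return bindings.Count;
    }
}
```

In the question's loop this could be called on `tempBody` (and once on `newBody`) right after `XElement.Parse(...)`, before the XML is added to the merged body.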

Atul Verma