
I have many Word documents (more than 10k) that contain images (a logo). I want to replace the logo in each document with another image. Some of these files may not contain any images, and some may contain several. The images are not necessarily in the header section of the document.

I have gone through some questions about this on Stack Overflow, mainly this one.

But being new to OpenXML, I'm currently not even able to replace the images inside a single Word document. When I try to replace an image, the code appears to run fine, but the document is not changed at all. Any help would be appreciated.

This is the code I've tried so far:

    byte[] docBytes = File.ReadAllBytes(_myFilePath);
    using (MemoryStream ms = new MemoryStream())
    {
        ms.Write(docBytes, 0, docBytes.Length);

        using (WordprocessingDocument wpdoc = WordprocessingDocument.Open(ms, true))
        {
            MainDocumentPart mainPart = wpdoc.MainDocumentPart;
            Document doc = mainPart.Document;

            IEnumerable<Drawing> drawings = mainPart.Document.Descendants<Drawing>().ToList();
            foreach (Drawing drawing in drawings)
            {
                DocProperties dpr = drawing.Descendants<DocProperties>().FirstOrDefault();
                if (dpr != null && dpr.Name == "Picture 1")
                {
                    foreach (DocumentFormat.OpenXml.Drawing.Blip b in drawing.Descendants<DocumentFormat.OpenXml.Drawing.Blip>().ToList())
                    {
                        OpenXmlPart imagePart = wpdoc.MainDocumentPart.GetPartById(b.Embed);
                        using (var writer = new BinaryWriter(imagePart.GetStream()))
                        {
                            // newImagePath: path to the replacement image
                            writer.Write(File.ReadAllBytes(newImagePath));
                        }
                    }
                }
            }
        }
    }

This produces no change in the single document I'm trying it on. I was also wondering how this can be done for documents with multiple images. In the case above, I opened the XML file and saw that the name in the doc properties was "Picture 1", but for Word documents with multiple images this won't be possible. Any help will be appreciated. Thanks.

Hank
    You're not saving the document you've created. Your document is currently in a memory stream, you have to then write the memory stream to your file system and from there you can open it. – Blue Eyed Behemoth Apr 23 '19 at 15:16

1 Answer


You're not saving the document you've modified. The document currently lives only in a memory stream, so you have to write that stream back to the file system before you can open the result. You're looking for something like the following:

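    // Requires: using System.Collections.Generic; using System.IO; using System.Linq;
    //           using DocumentFormat.OpenXml.Packaging;
    //           using DocumentFormat.OpenXml.Wordprocessing;          (Drawing)
    //           using DocumentFormat.OpenXml.Drawing.Wordprocessing;  (DocProperties)
    byte[] docBytes = File.ReadAllBytes(_myFilePath);
    using (MemoryStream ms = new MemoryStream())
    {
        ms.Write(docBytes, 0, docBytes.Length);
        using (WordprocessingDocument wpdoc = WordprocessingDocument.Open(ms, true))
        {
            MainDocumentPart mainPart = wpdoc.MainDocumentPart;
            IEnumerable<Drawing> drawings = mainPart.Document.Descendants<Drawing>().ToList();
            foreach (Drawing drawing in drawings)
            {
                DocProperties dpr = drawing.Descendants<DocProperties>().FirstOrDefault();
                if (dpr != null && dpr.Name == "Picture 1")
                {
                    foreach (DocumentFormat.OpenXml.Drawing.Blip b in drawing.Descendants<DocumentFormat.OpenXml.Drawing.Blip>().ToList())
                    {
                        string relId = b.Embed?.Value;
                        if (relId == null) continue; // linked image, nothing embedded to replace
                        OpenXmlPart imagePart = mainPart.GetPartById(relId);
                        // newImagePath: path to the replacement image.
                        // FeedData replaces the whole part, so no stale bytes remain if the new image is smaller.
                        using (FileStream newImage = File.OpenRead(newImagePath))
                        {
                            imagePart.FeedData(newImage);
                        }
                    }
                }
            }
        } // disposing the document flushes the changes into ms

        // Write the modified bytes to disk; fileName is the output path.
        // FileMode.Create overwrites an existing file (CreateNew throws if it already exists).
        using (FileStream fs = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            ms.Position = 0; // rewind first, otherwise CopyTo starts at the end and copies nothing
            ms.CopyTo(fs);
        }
    }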

If you think about it, you're reading a document in and storing it in RAM. You then manipulate it there, and as soon as you dispose of the memory stream, the changes vanish. You have to actually write the bytes you manipulated somewhere.
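For a large batch, one way to avoid the memory-stream round trip is to open each file by path in editable mode, in which case the SDK writes the changes back to the file when the document is disposed. The following is only a sketch under some assumptions: `LogoReplacer`, `ReplaceAllImages`, `docPath` and `newImagePath` are hypothetical names, the replacement image is assumed to be in the same format as the one it replaces, and every image in the document body is replaced without checking what it is.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class LogoReplacer
{
    // Replaces the content of every image part attached to the main document part.
    // Returns the number of image parts that were overwritten.
    public static int ReplaceAllImages(string docPath, string newImagePath)
    {
        int replaced = 0;

        // Opening by path with isEditable = true saves the changes to the file on Dispose.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docPath, true))
        {
            MainDocumentPart mainPart = doc.MainDocumentPart;
            if (mainPart == null)
            {
                return 0; // nothing to do for a package without a main part
            }

            foreach (ImagePart imagePart in mainPart.ImageParts)
            {
                using (FileStream newImage = File.OpenRead(newImagePath))
                {
                    // FeedData overwrites the part's existing bytes with the stream content.
                    imagePart.FeedData(newImage);
                }
                replaced++;
            }
        }

        return replaced;
    }
}
```

A document without images simply returns 0, so the same call can be run over a whole folder, for example with `Directory.EnumerateFiles(folder, "*.docx")`. Images placed in headers or footers belong to `mainPart.HeaderParts` and `mainPart.FooterParts` rather than to the main part, so they would need the same loop over each of those parts. Since this edits files in place, it is probably worth running it on copies first.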

Blue Eyed Behemoth
  • Yes Blue Eyed Behemoth, you're correct, really silly of me to overlook this. Following on from the question above about Word documents with multiple images: is it possible to do a comparison between images? For example, compare an image from my local machine against each image in the Word document, one by one, and only replace an image if the match rate is above 60% or so. – Hank Apr 25 '19 at 12:36
  • Well, let's think about this. If you want a 100% match, you can just check whether the byte arrays of the images are the same. But if you're looking for a pixel comparison, it gets complicated. There are algorithms/AI that measure how similar two images are, but I think that may be too complicated for what you need. If you want to go down that route, there are projects like `Contour-Analysis-for-Image-Recognition-in-C`, but it might be faster to just replace the images regardless. – Blue Eyed Behemoth Apr 25 '19 at 13:07
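
The exact-match check mentioned in the last comment can be sketched by hashing the image bytes instead of comparing whole arrays, which keeps only a 32-byte fingerprint of the known logo in memory. This is a minimal sketch, not part of the original answer: `ImageMatcher`, `HashOf` and `IsSameImage` are hypothetical names, and it only detects byte-identical images, not the "60% similar" case.

```csharp
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static class ImageMatcher
{
    // Computes a SHA-256 fingerprint of everything readable from the stream.
    public static byte[] HashOf(Stream stream)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(stream);
        }
    }

    // True when the candidate stream has exactly the same bytes as the known logo.
    public static bool IsSameImage(Stream candidate, byte[] knownLogoHash)
    {
        return HashOf(candidate).SequenceEqual(knownLogoHash);
    }
}
```

Inside the replacement loop, each image part's stream (`imagePart.GetStream()`) could be passed to `IsSameImage`, and the part replaced only on a match. One caveat: Word may re-encode an image when it is inserted, so the safest source for the reference hash is probably the logo bytes extracted from one of the documents rather than the original file on disk.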