I have the following code for a C# console app. It parses a Word document for textboxes and inserts the same text into the document at the textbox anchor point with markup. This is so I can convert to Markdown using pandoc, including textbox content which is not available due to https://github.com/jgm/pandoc/issues/3086. I can then replace my custom markup with markdown after conversion.

The console app is called in a PowerShell loop for all documents in a target list.

The first time I run the PowerShell script, all documents are opened and saved (under a new name) without error. But the next time I run it, I occasionally get this popup error:

The last time you opened '' it caused a serious error. Do you still want to open it?

I can get past this by selecting Yes on every popup, but that requires manual intervention and is tedious and slow. Why does this code cause the problem?

string path = args[0];

Console.WriteLine($"Parsing {path}");

Application word = new Application();
Document doc = word.Documents.Open(path);

try
{

    foreach (Shape shp in doc.Shapes)
    {
        if (shp.TextFrame.HasText != 0)
        {
            string text = shp.TextFrame.TextRange.Text;
            int page = shp.Anchor.Information[WdInformation.wdActiveEndPageNumber];
            string summary = Regex.Replace(text, @"\r\n?|\n", " ");
            Console.WriteLine($"++++textbox++++ Page {page}: {summary.Substring(0, Math.Min(summary.Length, 40))}");

            string newtext = @$"{Environment.NewLine}TEXTBOX START%%%{text}%%%TEXTBOX END{Environment.NewLine}";

            var range = shp.Anchor;
            range.InsertBefore(Environment.NewLine);
            range.Collapse();
            range.Text = newtext;
            range.set_Style(WdBuiltinStyle.wdStyleNormal);
        }
    }

    string newFile = Path.GetFullPath(path) + ".notb.docx";
    doc.SaveAs2(newFile);
}
finally
{
    doc.Close();
    word.Quit();
}
Eugene Astafiev
DLT

1 Answer

The console app is called in a PowerShell loop for all documents in a target list.

You can automate Word directly from your PowerShell script, without any other dependencies. At the very least, that lets you keep a single Word instance instead of creating a new Word Application instance for each document:

Application word = new Application();
Document doc = word.Documents.Open(path);

In the loop, you can just open each document for processing and then close it. That should improve the overall performance of your solution.

When you are done processing a document, close it with the Close method, which closes the specified document.

Also, whenever a new Word Application instance is created, don't forget to shut it down by calling the Quit method, which quits Microsoft Word and optionally saves or routes the open documents. For example, in VBA:

Application.Quit SaveChanges:=wdSaveChanges, OriginalFormat:=wdWordDocument
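
If you prefer to stay in C#, the same idea (one Word instance, each document closed after processing, Quit called once at the end) could look roughly like the sketch below. It is a minimal sketch under some assumptions, not a drop-in fix: the class name TextboxBatch and the methods OutputPathFor and Run are hypothetical, the app is assumed to receive all document paths as arguments, and Word is created through late binding (dynamic) so the sketch does not need the interop assembly reference. Passing false to Close and Quit is also my assumption, so that a failure in the middle of processing cannot leave a save prompt waiting behind an invisible window.

```csharp
using System;
using System.IO;

public static class TextboxBatch
{
    // Same naming rule as in the question: append ".notb.docx" to the full path.
    public static string OutputPathFor(string path) =>
        Path.GetFullPath(path) + ".notb.docx";

    // Processes every document with a single Word instance.
    // Requires Word to be installed (Windows only).
    public static void Run(string[] paths)
    {
        var wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
        {
            Console.Error.WriteLine("Word does not appear to be installed.");
            return;
        }

        dynamic word = Activator.CreateInstance(wordType);
        try
        {
            foreach (string path in paths)
            {
                Console.WriteLine($"Parsing {path}");
                dynamic doc = word.Documents.Open(Path.GetFullPath(path));
                try
                {
                    // ... textbox processing from the question goes here ...
                    doc.SaveAs2(OutputPathFor(path));
                }
                finally
                {
                    // Close this document without prompting, even on failure.
                    doc.Close(false);
                }
            }
        }
        finally
        {
            // Quit Word exactly once, after all documents are done.
            word.Quit(false);
        }
    }
}
```

You would call TextboxBatch.Run(args) from Main and pass the whole document list in one invocation instead of starting the app once per file. The same calls work with the typed Application and Document objects from the question if you keep the interop reference.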
Eugene Astafiev