Yes, you have found a bug in the SDK.
@Chris, first of all, per the semantics of the Open XML SDK, that code does modify the file. When you access the contents of a part and then leave the scope of the using statement, the contents of the part are written back into the package. This happens because the presentation was opened for read/write (the second argument of the call to the Open method).
The problem is that when the contents of the part are read from the package, the space is being stripped off.
// Open the document.
using (PresentationDocument presentationDocument = PresentationDocument.Open("test.pptx", true))
{
    // Just making this reference modifies the whitespace in the slide.
    Slide slide = presentationDocument.PresentationPart.SlideParts.First().Slide;
    var sh = slide.CommonSlideData.ShapeTree.Elements<DocumentFormat.OpenXml.Presentation.Shape>().First();
    Run r = sh.TextBody.Elements<Paragraph>().First().Elements<Run>().Skip(1).FirstOrDefault();
    Console.WriteLine(">{0}<", r.Text.Text);
    //r.Text.Text = " ";
}
If you run the above code on the presentation, you can see that by the time you access that text element, its text is already incorrect.
Interestingly, if you uncomment the line that sets the text, the slide does contain the space.
This is obviously a bug. I have reported it to the program manager at Microsoft who is responsible for the Open XML SDK.
As this scenario is important to you, I recommend that you use LINQ to XML for your code. The following code works fine:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Presentation;
using DocumentFormat.OpenXml.Drawing;

public static class PtOpenXmlExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument partXDocument = part.Annotation<XDocument>();
        if (partXDocument != null)
            return partXDocument;
        using (Stream partStream = part.GetStream())
        using (XmlReader partXmlReader = XmlReader.Create(partStream))
            partXDocument = XDocument.Load(partXmlReader);
        part.AddAnnotation(partXDocument);
        return partXDocument;
    }

    public static void PutXDocument(this OpenXmlPart part)
    {
        XDocument partXDocument = part.GetXDocument();
        if (partXDocument != null)
        {
            using (Stream partStream = part.GetStream(FileMode.Create, FileAccess.Write))
            using (XmlWriter partXmlWriter = XmlWriter.Create(partStream))
                partXDocument.Save(partXmlWriter);
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        using (PresentationDocument presentationDocument = PresentationDocument.Open("test.pptx", true))
        {
            XDocument slideXDoc = presentationDocument.PresentationPart.SlideParts.First().GetXDocument();
            XNamespace p = "http://schemas.openxmlformats.org/presentationml/2006/main";
            XNamespace a = "http://schemas.openxmlformats.org/drawingml/2006/main";
            XElement sh = slideXDoc.Root.Element(p + "cSld").Element(p + "spTree").Elements(p + "sp").First();
            XElement r = sh.Element(p + "txBody").Elements(a + "p").Elements(a + "r").Skip(1).FirstOrDefault();
            Console.WriteLine(">{0}<", r.Element(a + "t").Value);
        }
    }
}
You could, in theory, write some generic code to dig through the LINQ to XML tree, find all elements that contain only significant white space, then traverse the Open XML SDK element tree and set the text of the corresponding elements. That is a bit of a mess, but once done, you could use the strongly typed object model of the Open XML SDK 2.0, and the values of such elements would then be correct.
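As a rough sketch of the first half of that idea only (finding the affected elements in the LINQ to XML tree), something like the following might do. The class and method names are made up for illustration, and it assumes that only a:t elements matter; mapping the results back onto the Open XML SDK element tree is not shown.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class WhitespaceTextFinder
{
    private static readonly XNamespace A =
        "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Returns every a:t element whose text is non-empty and consists
    // entirely of white space characters.
    public static List<XElement> FindWhitespaceOnlyText(XDocument partXDoc)
    {
        return partXDoc
            .Descendants(A + "t")
            .Where(t => t.Value.Length > 0 && t.Value.All(c => char.IsWhiteSpace(c)))
            .ToList();
    }
}
```

You would pass it the XDocument returned by GetXDocument above. If you build a test document from a string instead, use XDocument.Parse(xml, LoadOptions.PreserveWhitespace), otherwise the white space is dropped while parsing.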
One technique that makes it easier to use LINQ to XML with Open XML is to preatomize XName objects. See http://blogs.msdn.com/b/ericwhite/archive/2008/12/15/a-more-robust-approach-for-handling-xname-objects-in-linq-to-xml.aspx
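A minimal sketch of what preatomized names could look like for the elements used in the code above. The class names PmlNames and DmlNames are placeholders of my choosing, and only the handful of names used in this thread are included.

```csharp
using System.Xml.Linq;

// Hypothetical static classes: each XName is created once and reused,
// so queries do not rebuild it from a namespace and a string every time.
public static class PmlNames
{
    public static readonly XNamespace p =
        "http://schemas.openxmlformats.org/presentationml/2006/main";
    public static readonly XName cSld = p + "cSld";
    public static readonly XName spTree = p + "spTree";
    public static readonly XName sp = p + "sp";
    public static readonly XName txBody = p + "txBody";
}

public static class DmlNames
{
    public static readonly XNamespace a =
        "http://schemas.openxmlformats.org/drawingml/2006/main";
    public static readonly XName p = a + "p";
    public static readonly XName r = a + "r";
    public static readonly XName t = a + "t";
}
```

With these in place, a query reads slideXDoc.Root.Element(PmlNames.cSld).Element(PmlNames.spTree) and a misspelled element name becomes a compile error instead of a query that silently returns nothing.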
-Eric