A portion of my C# .NET program contains code to modify elements within an XML document. The code works fine in terms of modifying the values based on the variables I'm setting elsewhere in the code, but the problem is that whitespace is being added to all of the <HashAlgorithm> elements when the XML file is saved with the updates.

I am not accessing this element at all in my code. I am assuming that it's because of the quotation marks around the Algorithm namespace value, because I can't see any other reason why this would happen. I am using LoadOptions.PreserveWhitespace on load and SaveOptions.DisableFormatting on save.

So the question is: why is it adding this extra whitespace, and how can I stop it?

XML (Source File)

<?xml version="1.0" encoding="UTF-8"?>
<PackingList xmlns="http://www.smpte-ra.org/schemas/2067-2/2016/PKL">
<Id>urn:uuid:296a656c-3610-4de1-9b08-2aa63245788d</Id>
<AnnotationText>JOT_Sample</AnnotationText>
<IssueDate>2018-02-16T20:59:42-00:00</IssueDate>
<Issuer>Generic</Issuer>
<Creator>Generic</Creator>
<AssetList>
  <Asset>
    <Id>urn:uuid:744f36b7-fc7e-4179-8b75-c71c18f98156</Id>
    <AnnotationText>Video_744f36b7-fc7e-4179-8b75-c71c18f98156.mxf</AnnotationText>
    <Hash>8HhnKnLn+Lp/Ik9i94Ml4SXAxH4=</Hash>
    <Size>14568486</Size>
    <Type>application/mxf</Type>
    <OriginalFileName>Video_744f36b7-fc7e-4179-8b75-c71c18f98156.mxf</OriginalFileName>
    <HashAlgorithm Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
  </Asset>
  <Asset>
    <Id>urn:uuid:bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3</Id>
    <AnnotationText>Audio_bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3.mxf</AnnotationText>
    <Hash>Wg4aEAE5Ji9e14ZyGkvfUUjBwCw=</Hash>
    <Size>4341294</Size>
    <Type>application/mxf</Type>
    <OriginalFileName>Audio_bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3.mxf</OriginalFileName>
    <HashAlgorithm Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
  </Asset>
</AssetList>
</PackingList>

XML (Output File)

<?xml version="1.0" encoding="UTF-8"?>
<PackingList xmlns="http://www.smpte-ra.org/schemas/2067-2/2016/PKL">
<Id>urn:uuid:296a656c-3610-4de1-9b08-2aa63245788d</Id>
<AnnotationText>JOT_Sample</AnnotationText>
<IssueDate>2018-02-16T20:59:42-00:00</IssueDate>
<Issuer>Generic</Issuer>
<Creator>Generic</Creator>
<AssetList>
  <Asset>
    <Id>urn:uuid:744f36b7-fc7e-4179-8b75-c71c18f98156</Id>
    <AnnotationText>Video_744f36b7-fc7e-4179-8b75-c71c18f98156.mxf</AnnotationText>
    <Hash>8HhnKnLn+Lp/Ik9i94Ml4SXAxH4=</Hash>
    <Size>14568486</Size>
    <Type>application/mxf</Type>
    <OriginalFileName>Video_744f36b7-fc7e-4179-8b75-c71c18f98156.mxf</OriginalFileName>
    <HashAlgorithm Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
  </Asset>
  <Asset>
    <Id>urn:uuid:bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3</Id>
    <AnnotationText>Audio_bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3.mxf</AnnotationText>
    <Hash>Wg4aEAE5Ji9e14ZyGkvfUUjBwCw=</Hash>
    <Size>4341294</Size>
    <Type>application/mxf</Type>
    <OriginalFileName>Audio_bf5438ea-ba58-4ae0-a64a-5d23cee2ebb3.mxf</OriginalFileName>
    <HashAlgorithm Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
  </Asset>
</AssetList>
</PackingList>

Code (partial):

XDocument pkldoc = XDocument.Load(packing, LoadOptions.PreserveWhitespace);

var pklns = pkldoc.Root.GetDefaultNamespace();

var pkluuid = pkldoc.Descendants(pklns + "Id").FirstOrDefault().Value; 

var pklassetElements = pkldoc.Descendants(pklns + "Asset");

foreach (var pklasset in pklassetElements)
{
    var idElement = pklasset.Descendants(pklns + "Id").First();
    if (!idElement.Value.Equals(cpluuid))
        continue;

    SetNewValue(pklasset, pklns + "OriginalFileName", outfile);
}

// Sets the value of every descendant of currentElement named elementName.
void SetNewValue(XElement currentElement, XName elementName, string newValue)
{
    var matchingElements = currentElement.Descendants(elementName);

    if (matchingElements.Any())
    {
        foreach (var element in matchingElements)
            element.SetValue(newValue);
    }
}

pkldoc.Save(packing, SaveOptions.DisableFormatting);
FileInfo fi = new FileInfo(packing);
var pklsize = fi.Length;
jamlot
  • That extra space is not significant – it doesn't change the value of the (empty) element at all. I don't think there's a way to get the serializer to change the way it formats those elements. – asherber Feb 24 '18 at 03:35
  • I hear you, but that's a tough sell to someone who's less technically savvy, who I'll need to convince to use this on hundreds of XMLs. They can easily download a free text editor and compare the files to see that there's an unintended change from source to output. Even if it's technically "insignificant," that will be hard to sell. And either way, there's still the question of why it's happening. – jamlot Feb 24 '18 at 05:19
  • The "why" probably has a couple of answers. One is (IMO) that it looks nicer. Another has to do with an old discussion about compatibility of XHTML with older browsers; see https://stackoverflow.com/a/462997/226781. If it really bothers you or your client, you should be able to do `pkldoc.ToString().Replace(" />", "/>")` before saving. – asherber Feb 24 '18 at 14:45
  • @asherber Thanks for the info and the tip. Unfortunately that doesn't work, as it appears that the spaces are being added in the save command itself. I can strip out all code but the save, and it still introduces the spaces. I realize these don't affect the validity of the XML; it just looks sloppy to me when comparing source to output. – jamlot Feb 24 '18 at 16:48
  • See my comment below. Just calling `Replace()` doesn't change the `XDocument` itself, but it will let you change the string before you save it. – asherber Feb 24 '18 at 18:19
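
The behaviour discussed in these comments can be reproduced without touching any files. The following is a minimal sketch, not the asker's code: the class name `EmptyElementDemo` and the methods `Raw` and `Compact` are hypothetical names chosen for illustration. It shows that the space appears whenever LINQ to XML serializes an empty element, even with `SaveOptions.DisableFormatting`, and that `Replace()` only changes the resulting string, not the document.

```csharp
using System.Xml.Linq;

// Hypothetical helper for illustration only; not part of the original program.
public static class EmptyElementDemo
{
    // Serializes the XML with formatting disabled, as LINQ to XML emits it.
    // Empty elements come out as "<name ... />" with a space before "/>".
    public static string Raw(string xml) =>
        XElement.Parse(xml, LoadOptions.PreserveWhitespace)
                .ToString(SaveOptions.DisableFormatting);

    // Same serialization, with the space before "/>" removed from the string.
    // The parsed element itself is not modified by this call.
    public static string Compact(string xml) =>
        Raw(xml).Replace(" />", "/>");
}
```

Calling `EmptyElementDemo.Raw("<Asset><HashAlgorithm Algorithm=\"x\"/></Asset>")` should give back the element with ` />`, while `Compact` on the same input returns it with `/>`.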

1 Answer


This works, though it's not very clean on my part.

string text = File.ReadAllText(packing);
text = text.Replace(" />", "/>");
File.WriteAllText(packing, text);

UPDATE

This is the solution. Thank you, @asherber!

var textToSave = pkldoc.ToString(SaveOptions.DisableFormatting).Replace(" />", "/>"); 
File.WriteAllText(packing, textToSave);
jamlot
  • Sorry, my comment above was not clear enough. What I was suggesting was: `var textToSave = pkldoc.ToString(SaveOptions.DisableFormatting).Replace(" />", "/>"); File.WriteAllText(packing, textToSave);` This is more efficient than writing the file out, reading it, getting rid of the space, and writing it out again. – asherber Feb 24 '18 at 18:17
  • Aha! Yes, much cleaner. Thanks a lot for your help with this. Much appreciated! – jamlot Feb 24 '18 at 22:30
  • @asherber I just noticed that the XML declaration `<?xml version="1.0" encoding="UTF-8"?>` at the head is being stripped out in the file output. Stepping through the code, it seems to be happening when the modified XML is being converted to a string in this line: `var textToSave = pkldoc.ToString(SaveOptions.DisableFormatting).Replace(" />", "/>");` Even if I take out the options and reduce to `var textToSave = pkldoc.ToString();` - it's still missing the declaration. Any ideas? – jamlot Mar 03 '18 at 00:39
  • @asherber Actually it looks like this might be removed when the file is read in `XDocument.Load`, and normally gets put back into the file by XDocument when writing. Any ideas? – jamlot Mar 03 '18 at 00:48
  • The declaration is not removed; it's just stored somewhere else in the `XDocument`. Try `File.WriteAllText(packing, pkldoc.Declaration + textToSave);` – asherber Mar 03 '18 at 04:36
  • @asherber That works perfectly. I can't thank you enough. Cheers! – jamlot Mar 03 '18 at 09:20
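
Putting the answer and the comments above together, the whole save step might look like the sketch below. It is only one way to arrange it: `CompactXmlWriter`, `Serialize`, and `Save` are hypothetical names, and the null check on `Declaration` is an added assumption for documents that have no XML declaration.

```csharp
using System.IO;
using System.Xml.Linq;

// Hypothetical helper for illustration only; names are not from the original post.
public static class CompactXmlWriter
{
    // Builds the text to write: the declaration (if the document has one)
    // followed by the unformatted body with " />" collapsed to "/>".
    public static string Serialize(XDocument doc)
    {
        string body = doc.ToString(SaveOptions.DisableFormatting)
                         .Replace(" />", "/>");

        // XDocument.ToString() does not emit the declaration, so prepend it by hand.
        return doc.Declaration == null ? body : doc.Declaration.ToString() + body;
    }

    // Writes the serialized text to the given path.
    public static void Save(XDocument doc, string path) =>
        File.WriteAllText(path, Serialize(doc));
}
```

With the variables from the question, the call would be `CompactXmlWriter.Save(pkldoc, packing);` in place of `pkldoc.Save(...)`. Note that the `Replace` is a blanket string substitution, so it would also touch a literal ` />` inside a comment or CDATA section if the document had one. Whether a line break follows the declaration depends on what the loaded document preserved, so it is worth diffing one output file against its source before running this over hundreds of files.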