My Problem: I am parsing a bunch of XML-based logs (which I have little control over) into MySQL statements, to switch over from an XML-based database to MySQL. This bit has me stumped.
If I look at the IEnumerable<XElement> that contains the string I'm interested in, I can see the embedded XML tag. However, if I take the value of that element, the XML tag disappears. For example:
IEnumerable (<PowerFail /> is visible):
<StepDetails>Set input voltage to 2.80V WDT should allow CPU power. CPU should detect PowerFail signal and output a<PowerFail /> tag to the serial line. WDT should reset every 1.6 seconds</StepDetails>
And taking the value, the <PowerFail /> tag is missing from the string:
Set input voltage to 2.80V WDT should allow CPU power. CPU should detect PowerFail signal and output a tag to the serial line. WDT should reset every 1.6 seconds
I get the same thing if I do a .ToString().
Procedure:
If you paste the following into LINQPad as C# Statements, you can see what I mean. The XML tag <PowerFail /> disappears. I noticed it also disappears in here unless I place back ticks around it. I've included the LINQPad tag because that's how I'm parsing these files (there are tens of thousands of log files going back years): a series of LINQPad scripts processes the logs into MySQL statements and inserts them to create the new database.
My Question: I realize I can get the string out with some regex or substring or something, but it seems like I should be able to get the whole string, tags and all, straight from the IEnumerable. How do I do that? Also, for my own edification, I'm curious to know why the tag is swallowed.
I have roughly three dozen variants of these types of log anomalies affecting the tens of thousands of logs across seven years or so of data (the last one I fixed, yesterday, applied to 1500+ logs alone). So I'd like to find a more generic solution instead of a tag-specific regex, substring or whatever for each of them. I can't change the logs, and I don't want to lose data while transferring to the new database.
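My current understanding, which I'd be glad to have confirmed or corrected, is that XElement.Value only concatenates the text content of an element, so an empty child element such as <PowerFail /> contributes nothing to it. The sketch below lists the child nodes of a shortened stand-in for my StepDetails element. The sample string and variable names are made up for illustration, and the top-level statements assume C# 9 or later (in LINQPad the using lines can normally be dropped, since those namespaces are imported by default).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Shortened stand-in for the real StepDetails element (illustrative only).
var step = XElement.Parse(
    "<StepDetails>output a<PowerFail/> tag to the serial line</StepDetails>");

// Mixed content is held as separate child nodes: text, element, text.
foreach (XNode node in step.Nodes())
{
    Console.WriteLine($"{node.NodeType}: {node}");
}

// Value joins only the text nodes, so the empty element adds nothing.
Console.WriteLine(step.Value);
```

If that reading is right, the tag is not lost from the document; it is just not part of the element's text value.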
To View the Problem Firsthand: Cut & paste into LINQPad as C# Statements. (Is there an online way to do this, similar to JSFiddle for JavaScript?) I've added a regex solution to the bottom in case someone comes looking for something like that, but I'm still interested in a better way to do it.
string xml = @"<StepResults>
  <TestStep Name='2.8V OPERATION' Result='Pass'>
    <OperatorComment/>
    <StepDetails>Set input voltage to 2.80V WDT should allow CPU power. CPU should detect PowerFail signal and output a<PowerFail/> tag to the serial line. WDT should reset every 1.6 seconds</StepDetails>
    <Measurements NumberOfMeasurements='1'>
      <Measurement Name='BATTERY VOLTAGE: VOLTS'>
        <MeasuredValue>2.794608</MeasuredValue>
        <Min>2.785000</Min>
        <Max>2.800000</Max>
      </Measurement>
    </Measurements>
  </TestStep>
</StepResults>";
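// Parse the log and show the whole document.
var xd = XDocument.Parse(xml);
Console.WriteLine(xd);

// Select every StepDetails element.
var xe =
    from e in xd.Descendants("StepDetails")
    select e;

// In LINQPad this shows the element with <PowerFail /> intact...
Console.WriteLine(xe);
// ...but Value returns the text without the tag.
Console.WriteLine(xe.First().Value);

// New code below to show a working regex solution:
// serialize the element, then strip its own opening and closing tags.
string stepDetail = xe.First().ToString();
Regex matchFrontTag = new Regex("^<[^>]*>");
Regex matchRearTag = new Regex("<[^>]*>$");
stepDetail = matchFrontTag.Replace(stepDetail, string.Empty);
stepDetail = matchRearTag.Replace(stepDetail, string.Empty);
Console.WriteLine(stepDetail);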
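One non-regex candidate I'm aware of is to serialize the element's child nodes and join them, which should give the inner XML without caring which tag is embedded. This is only a sketch, not necessarily the best way: InnerXml is a hypothetical helper name of mine, the sample string is shortened for illustration, and the top-level statements assume C# 9 or later (in LINQPad the using lines can normally be dropped).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: returns the markup between an element's start and end tags
// by serializing each child node (text or element) and concatenating the results.
static string InnerXml(XElement element) =>
    string.Concat(element.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));

// Shortened stand-in for the real StepDetails element (illustrative only).
var sample = XElement.Parse(
    "<StepDetails>output a<PowerFail/> tag to the serial line</StepDetails>");

// Prints the text with the embedded tag kept in place.
Console.WriteLine(InnerXml(sample));
```

Two things to be aware of with this approach: the nodes are written back out as XML, so characters such as & in the text come out escaped (&amp;), and an empty tag may be re-serialized in a normalized form rather than exactly as it appeared in the log.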