
Due to a PHP error in another product, I sometimes get an ill-formed XML response like this:

<?xml version="1.0" encoding="UTF-8"?>
<customfields>
</customfields>Warning
Router: https://example.com/api/index.php?/Tickets/TicketCustomField/Get
file_put_contents(./__swift/cache/SWIFT_Loader.cache): failed to open stream: Invalid argument (C:/Kayako/support/__swift/library/Loader/class.SWIFT_Loader.php:1630)

Does a safe method exist to clean up this string before I deserialize it?

The claim of duplicate is correct, but the linked duplicate does not give a working solution.

My current temporary solution only works if the string starts with valid XML and the appended error does not contain another closing tag that matches the root tag:

RegexOptions options = RegexOptions.Singleline | RegexOptions.Compiled;
var tidyStreamContents = Regex.Match(streamContents, @"^<\?xml.*?\?>\s*?<(.*?)>.*</(\1)>", options, Regex.InfiniteMatchTimeout).ToString();
Gareth A. Lloyd
    You could drop everything after the last `>`, somewhat naive but it could work. Have you considered contacting the customer support of the product? – user247702 Oct 07 '14 at 15:04
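
The suggestion in the comment above could be sketched roughly as follows. This is only a minimal illustration: the class and method names (`NaiveTidy`, `TrimAfterLastTag`) are hypothetical, and it assumes the appended error text never contains a `>` character, which holds for the sample in the question but is not guaranteed in general.

```csharp
using System;

static class NaiveTidy
{
    // Hypothetical helper: keeps everything up to and including the last '>'.
    // If the input contains no '>' at all, it is returned unchanged.
    public static string TrimAfterLastTag(string raw)
    {
        int last = raw.LastIndexOf('>');
        return last < 0 ? raw : raw.Substring(0, last + 1);
    }
}
```

If the error message itself contains a `>`, the result will still include part of the error, so the output should be validated by the XML parser afterwards.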

1 Answer


You can use CsQuery to treat the invalid XML as HTML, clean it up, and output it as a string for further processing:

using CsQuery;

var cq = CQ.CreateFromFile("input.txt");
// C# uses the indexer for selectors, and RenderSelection is a method
var sCleanXML = cq["customfields"].RenderSelection();

Input (content of input.txt):

<?xml version="1.0" encoding="UTF-8"?>
<customfields>
</customfields>Warning
Router: https://example.com/api/index.php?/Tickets/TicketCustomField/Get
file_put_contents(./__swift/cache/SWIFT_Loader.cache): failed to open stream: Invalid argument (C:/Kayako/support/__swift/library/Loader/class.SWIFT_Loader.php:1630)

Output (value of sCleanXML):

<customfields> </customfields>

An alternative would be using XmlReader or HtmlAgilityPack.
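
For the XmlReader route, a minimal sketch might look like the following. It is not taken from the original answer: the class and method names (`XmlTidy`, `ReadRootElement`) are hypothetical, and it assumes the junk only appears after the root element's closing tag, as in the sample input. The idea is to position the reader on the root element and load only that subtree, so the parser is never asked to read past the closing tag.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlTidy
{
    // Hypothetical helper: loads the root element and ignores whatever
    // follows its closing tag (assumption: the junk is only appended).
    public static XElement ReadRootElement(string raw)
    {
        using (var reader = XmlReader.Create(new StringReader(raw)))
        {
            // Skip the XML declaration and whitespace; stop on the root element.
            reader.MoveToContent();

            // The subtree reader reports end-of-file at the root's end tag,
            // so the trailing text is never parsed.
            using (var subtree = reader.ReadSubtree())
            {
                return XElement.Load(subtree);
            }
        }
    }
}
```

Calling `XmlTidy.ReadRootElement(streamContents).ToString()` would then give a clean string to deserialize. If the error text lands before or inside the XML instead, the reader should still throw an `XmlException`, which is probably the safer outcome.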

Community
Victor Zakharov