
I'm having trouble writing an XML parser/merger. One of the XML files is encoded as UTF-8 with a BOM, and XElement.Parse throws an error on it. If I convert the file to UTF-8 without a BOM, the error goes away.

The error is "Data at the root level is invalid. Line 1, position 1."

The XML is downloaded from a SOAP API as a byte array and then converted to a string like this:

Dim SourceFile_as_Byte = SOAPAPI.Download 'I download the file using a SOAP API method.
Dim ByteEncoder As System.Text.Encoding = System.Text.Encoding.UTF8
Dim SourceFile_as_string = ByteEncoder.GetString(SourceFile_as_Byte) 'The decoded string still starts with the BOM character.
Dim XMLdoc As XElement = XElement.Parse(SourceFile_as_string) 'Throws when the string starts with the BOM.

I've found other solutions, such as XElement.Load, which works regardless (it seems Load handles the encoding itself?). However, due to the nature of the solution I need to use the Parse method, which is why I'm trying to convert the string and remove the BOM.

Thanks

daviddgz
  • You could use the [UTF8Encoding.GetString(Byte(), Int32, Int32) Method](https://learn.microsoft.com/en-us/dotnet/api/system.text.utf8encoding.getstring): `Dim sourceFileAsString = byteEncoder.GetString(sourceFileAsByte, 3, sourceFileAsByte.Length - 3)`. – Andrew Morton Apr 02 '21 at 17:55
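
The offset in the comment above assumes the BOM is always present. A minimal sketch of the same idea that checks for the preamble first is shown here; `DecodeUtf8WithoutBom` and the sample XML are hypothetical names and data chosen for illustration, not part of the original code.

```vbnet
Imports System.Linq
Imports System.Text
Imports System.Xml.Linq

Module BomSkipSketch
    ' Hypothetical helper: decodes UTF-8 bytes and skips the BOM only when it is present.
    Function DecodeUtf8WithoutBom(data As Byte()) As String
        Dim bom As Byte() = Encoding.UTF8.GetPreamble() ' EF BB BF
        Dim offset As Integer = 0
        If data.Length >= bom.Length AndAlso data.Take(bom.Length).SequenceEqual(bom) Then
            offset = bom.Length
        End If
        Return Encoding.UTF8.GetString(data, offset, data.Length - offset)
    End Function

    Sub Main()
        ' Sample payload standing in for the SOAP download (an assumption).
        Dim body As Byte() = Encoding.UTF8.GetBytes("<root><item>1</item></root>")
        Dim withBom As Byte() = Encoding.UTF8.GetPreamble().Concat(body).ToArray()

        ' Both inputs should parse, with or without the BOM.
        Console.WriteLine(XElement.Parse(DecodeUtf8WithoutBom(withBom)).Name.LocalName)
        Console.WriteLine(XElement.Parse(DecodeUtf8WithoutBom(body)).Name.LocalName)
    End Sub
End Module
```

Checking the preamble rather than always skipping three bytes keeps the helper safe for files that were saved without a BOM.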

1 Answer


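Encoding.UTF8.GetString() and GetChars() do not strip the BOM: when the byte array starts with the UTF-8 preamble (EF BB BF), the resulting string begins with the corresponding character, U+FEFF (you can see that the string is 1 char longer when the source file is saved with the BOM). XElement.Parse does not accept that character before the root element, hence the error at line 1, position 1.
You can remove the char, if it's there, with the TrimStart() method.
In VB.NET the char can be written as ChrW(&HFEFF) (\uFEFF in C#).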

Dim sourceBytes = SOAPAPI.Download
Dim xml = Encoding.UTF8.GetString(sourceBytes).TrimStart(ChrW(&HFEFF))
Dim xmlDoc = XElement.Parse(xml)
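
As a possible alternative that still ends in XElement.Parse, the bytes can be decoded through a StreamReader, which consumes the UTF-8 BOM while reading. This is only a sketch: `ReadXmlText` and the sample bytes are hypothetical, standing in for the data returned by the SOAP download.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml.Linq

Module BomReaderSketch
    ' Hypothetical helper: StreamReader drops the BOM while decoding the bytes.
    Function ReadXmlText(data As Byte()) As String
        Using ms As New MemoryStream(data)
            Using reader As New StreamReader(ms, Encoding.UTF8, True)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        ' UTF-8 BOM (EF BB BF) followed by "<a/>"; sample data, not from the original post.
        Dim sample As Byte() = {&HEF, &HBB, &HBF, &H3C, &H61, &H2F, &H3E}

        ' GetString keeps the BOM as a leading U+FEFF character.
        Console.WriteLine(Encoding.UTF8.GetString(sample).Length) ' 5

        ' StreamReader removes it, so the text can go straight to Parse.
        Dim text As String = ReadXmlText(sample)
        Console.WriteLine(text.Length) ' 4
        Console.WriteLine(XElement.Parse(text).Name.LocalName)
    End Sub
End Module
```

The TrimStart approach is shorter; the reader-based version may be preferable when the same code also has to handle input that arrives as a stream.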
Jimi