When I URL-encode a string (namely an XML file), it sometimes adds %00 characters at the end. I'd like to know why this happens and whether it can be prevented (I can always strip the %00 characters). The XML was created with XmlWriter. The odd thing is that I use the same code to create other XML files, and encoding those doesn't add any %00 characters.

Example:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE peticion >
<peticion>
    <nombre>Info hotel</nombre>
    <agencia>HOTUSA</agencia>
    <tipo>15</tipo>
</peticion>

Edit: this is how I create the XML.

Dim xmlWriterSettings As New System.Xml.XmlWriterSettings
With xmlWriterSettings
    .Encoding = Encoding.GetEncoding("iso-8859-1")
    .OmitXmlDeclaration = False
    .Indent = True
End With

Dim ms As New IO.MemoryStream

Using writer As System.Xml.XmlWriter = System.Xml.XmlWriter.Create(ms, xmlWriterSettings)
    With writer
        .WriteDocType("peticion", Nothing, Nothing, Nothing)
        .WriteStartElement("peticion")
        .WriteElementString("nombre", "Info hotel")
        .WriteElementString("agencia", "HOTUSA")
        .WriteElementString("tipo", "15")
        .WriteEndElement()
    End With
End Using

Dim xml As String = Encoding.GetEncoding("iso-8859-1").GetString(ms.GetBuffer)

Dim XmlEncoded As String = HttpUtility.UrlEncode(xml)

XmlEncoded contains:

%3c%3fxml+version%3d%221.0%22+encoding%3d%22iso-8859-1%22%3f%3e%0d%0a%3c!DOCTYPE+peticion+%3e%0d%
0a%3cpeticion%3e%0d%0a++%3cnombre%3eInfo+hotel%3c%2fnombre%3e%0d%0a++%3cagencia%3eHOTUSA%3c%
2fagencia%3e%0d%0a++%3ctipo%3e15%3c%2ftipo%3e%0d%0a%3c%2fpeticion%3e%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00

Where do all these %00 come from?

ShengLong
  • Please show the declaration of the variable `xml` and the code for creating its content. – Codo Jul 26 '12 at 15:04

2 Answers

The remarks on MemoryStream.GetBuffer provide the appropriate guidance:

Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory.

Modify your code like so:

Dim xml As String = Encoding.GetEncoding("iso-8859-1").GetString(ms.ToArray)
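If the copy made by ToArray is a concern, one possible alternative is to keep GetBuffer but decode only the first ms.Length bytes. The sketch below is a minimal illustration rather than a drop-in replacement; the module name GetBufferDemo and the "test" payload are assumptions chosen to mirror the documentation remark quoted above.

```vbnet
Imports System
Imports System.IO
Imports System.Text

' Hypothetical demo module, not part of the original question's code.
Module GetBufferDemo
    Sub Main()
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")

        Using ms As New MemoryStream()
            Dim payload As Byte() = latin1.GetBytes("test")
            ms.Write(payload, 0, payload.Length)

            ' GetBuffer exposes the whole internal array, including unused capacity,
            ' so the decoded string carries trailing NUL characters.
            Dim padded As String = latin1.GetString(ms.GetBuffer())

            ' Limiting the decode to ms.Length skips the unused tail without copying.
            Dim exact As String = latin1.GetString(ms.GetBuffer(), 0, CInt(ms.Length))

            Console.WriteLine("Padded length: " & padded.Length)
            Console.WriteLine("Exact length:  " & exact.Length)
        End Using
    End Sub
End Module
```

The padded string is as long as the stream's internal buffer, while the exact string contains only the bytes that were written, which is what should be passed to HttpUtility.UrlEncode.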

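In fact, a simpler option in this case may be to skip the MemoryStream and write to a StringBuilder. Be aware that the Encoding setting is ignored when the output is a StringBuilder, so the XML declaration will report utf-16 rather than iso-8859-1: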

Dim sb As New StringBuilder
Using writer As XmlWriter = XmlWriter.Create(sb, xmlWriterSettings)
    ' ...
End Using        

Dim xml As String = sb.ToString()
user7116

I believe ms.GetBuffer returns more than you think. %00 is a URL-encoded NUL byte, and my guess is that the buffer is padded with NUL bytes at the end.

Instead do:

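Using ms As New IO.MemoryStream
    ' Dispose the writer before reading so that all buffered output reaches the stream.
    Using writer As System.Xml.XmlWriter = System.Xml.XmlWriter.Create(ms, xmlWriterSettings)
        With writer
            .WriteDocType("peticion", Nothing, Nothing, Nothing)
            .WriteStartElement("peticion")
            .WriteElementString("nombre", "Info hotel")
            .WriteElementString("agencia", "HOTUSA")
            .WriteElementString("tipo", "15")
            .WriteEndElement()
        End With
    End Using

    ms.Position = 0
    ' MemoryStream has no ReadToEnd method; read through a StreamReader with the matching encoding.
    Using reader As New IO.StreamReader(ms, Encoding.GetEncoding("iso-8859-1"))
        Dim xml As String = reader.ReadToEnd()
        Dim XmlEncoded As String = HttpUtility.UrlEncode(xml)
    End Using
End Using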

See this question for more info on getting a string from a MemoryStream.

See this documentation detailing the fact that the buffer contains allocated bytes which might be unused.

Sumo
  • I think @sumo is right. The documentation for MemoryStream explains that MemoryStreams are not necessarily resizable, so there could be filler in there: "Memory streams created with an unsigned byte array provide a non-resizable stream of the data. When using a byte array, you can neither append to nor shrink the stream, although you might be able to modify the existing contents depending on the parameters passed into the constructor. Empty memory streams are resizable, and can be written to and read from." – Dave Cameron Jul 26 '12 at 15:54
  • Thanks for your answer too, Sumo. You indicated first that there was something strange in ms.GetBuffer. – ShengLong Jul 27 '12 at 14:52