I've been reading answers that explain how to get the size of a string, either in memory or in a file.
My intention is to determine the number of bytes that a string will occupy, in a specified encoding, when written to a file.
However, my function does not return the expected result when I check the size of a string for Encoding.UTF8, Encoding.Unicode (UTF-16) or Encoding.UTF32.
This is what I'm doing:
Imports System.Diagnostics
Imports System.Runtime.CompilerServices
Imports System.Text

' Note: an <Extension> method must be declared inside a Module.

''' ----------------------------------------------------------------------
''' <summary>
''' Gets the size, in bytes, that a string will occupy when written to a file.
''' </summary>
''' ----------------------------------------------------------------------
<DebuggerStepThrough>
<Extension>
Public Function SizeInFile(ByVal sender As String,
                           Optional ByVal encoding As Encoding = Nothing) As Integer
    ' Fall back to the system default encoding when none is specified.
    If (encoding Is Nothing) Then
        encoding = System.Text.Encoding.Default
    End If
    Return encoding.GetByteCount(sender)
End Function
This is how I'm testing it. In the code below, the function reports a string size of 2 bytes, but when the string is written to a file, the file size is 4 bytes:
Dim str As String = "Ñ"
Console.WriteLine(String.Format("Size of String : {0}", str.SizeInFile(Encoding.Unicode)))
File.WriteAllText(".\Test.txt", str, Encoding.Unicode)
Console.WriteLine(String.Format("Size of txtfile: {0}", New FileInfo(".\Test.txt").Length))
What am I missing to evaluate the string size accurately?
Answers in C# or VB.NET are welcome.
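One possible explanation, stated here as an assumption rather than something confirmed above: File.WriteAllText writes the encoding's preamble (byte-order mark) before the text, while GetByteCount counts only the encoded characters. The sketch below adds Encoding.GetPreamble().Length to the count so the two numbers can be compared. SizeInFileWithPreamble, the module name, and the temporary file are hypothetical names introduced only for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module PreambleSketch

    ' Hypothetical helper (not part of the original code): counts the encoded
    ' bytes of the string plus the encoding's preamble (BOM), if it has one.
    Public Function SizeInFileWithPreamble(ByVal value As String,
                                           ByVal enc As Encoding) As Integer
        Return enc.GetPreamble().Length + enc.GetByteCount(value)
    End Function

    Sub Main()
        Dim str As String = "Ñ"
        ' Assumption: a temporary file is acceptable for the comparison.
        Dim tempFile As String = Path.GetTempFileName()

        File.WriteAllText(tempFile, str, Encoding.Unicode)

        Console.WriteLine("Bytes without preamble: {0}", Encoding.Unicode.GetByteCount(str))
        Console.WriteLine("Preamble length       : {0}", Encoding.Unicode.GetPreamble().Length)
        Console.WriteLine("Bytes with preamble   : {0}", SizeInFileWithPreamble(str, Encoding.Unicode))
        Console.WriteLine("Actual file length    : {0}", New FileInfo(tempFile).Length)

        File.Delete(tempFile)
    End Sub

End Module
```

Encodings constructed without a byte-order mark, such as New UTF8Encoding(False), return an empty preamble, so for those the helper gives the same result as GetByteCount alone.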