I need to compute the size, in bytes, that a substring will occupy once converted to a UTF-8 byte array, without actually performing the conversion. The string I'm working with is very large, so I have to be careful not to create another large string (or byte array) in memory.
There's a GetByteCount method on Encoding.UTF8, but I don't see an overload that accepts a range within a string without first copying the string into a char array. This doesn't work for me:
Encoding.UTF8.GetByteCount(stringToCount.ToCharArray(), startIndex, count);
because stringToCount.ToCharArray() will create a copy of my string.
Here's what I have right now:
public static int CalculateTotalBytesForUTF8Conversion(string stringToCount, int startIndex, int endIndex)
{
    var totalBytes = 0;
    for (int i = startIndex; i < endIndex; i++)
        totalBytes += Encoding.UTF8.GetByteCount(new char[] { stringToCount[i] });
    return totalBytes;
}
The GetByteCount method doesn't appear to have an overload that takes a single char, so this is the compromise I've settled on for now.
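One alternative I've been considering is to skip GetByteCount entirely and count from the UTF-16 code unit ranges, which avoids allocating a one-element array per character. This is only a rough sketch: the class and method names (Utf8Counter, CountUtf8Bytes) are hypothetical, and it assumes that Encoding.UTF8 replaces an unpaired surrogate with U+FFFD (3 bytes), which I believe is its default behavior. It also treats a surrogate pair as one 4-byte code point, which counting one char at a time would not do.

```csharp
using System;

public static class Utf8Counter
{
    // Hypothetical helper: counts the UTF-8 bytes for s[startIndex..endIndex)
    // without allocating any intermediate string or array.
    public static int CountUtf8Bytes(string s, int startIndex, int endIndex)
    {
        int total = 0;
        for (int i = startIndex; i < endIndex; i++)
        {
            char c = s[i];
            if (c < 0x80)
            {
                total += 1; // ASCII
            }
            else if (c < 0x800)
            {
                total += 2; // U+0080..U+07FF
            }
            else if (char.IsHighSurrogate(c)
                     && i + 1 < endIndex
                     && char.IsLowSurrogate(s[i + 1]))
            {
                total += 4; // valid surrogate pair: one code point, 4 bytes
                i++;        // skip the low surrogate
            }
            else
            {
                // Remaining BMP characters take 3 bytes. Assumption: a lone
                // surrogate is replaced by U+FFFD, which is also 3 bytes.
                total += 3;
            }
        }
        return total;
    }
}
```

Note that a range that starts or ends in the middle of a surrogate pair would be counted as containing a lone surrogate, so the caller would need to pick the boundaries carefully.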
Is this the right way to determine the UTF-8 byte count of a substring without actually performing the conversion? Or is there a better way to do it?