
I have a stream whose next N bytes are a UTF8 encoded string. I want to create that string with the least overhead.

This works:

var bytes = new byte[n];
stream.Read(bytes, 0, n); // my actual code checks return value
var str = Encoding.UTF8.GetString(bytes);

In my benchmarking I see considerable time spent collecting garbage in the form of byte[] temporaries. If I can get rid of these, I can effectively halve my heap allocations.

The UTF8Encoding class doesn't have methods for working with streams.

I can use unsafe code, if that helps. I cannot reuse a byte[] buffer without ThreadLocal<byte[]> which seems to introduce more overhead than it alleviates. I do need to support UTF8 (ASCII won't cut it).

Is there an API or technique here that I'm missing?

Drew Noakes
  • Note to self: http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ – Drew Noakes Dec 27 '15 at 06:59
  • You could subclass `Stream` to make a `TruncatedProxyStream` that wraps the original stream and reads at most `n` bytes from the underlying stream. Then pass that to a `StreamReader`. – dbc Dec 27 '15 at 07:49
  • How big are these `byte[]` temporaries? Is the problem that you are allocating lots of tiny arrays, or a few large arrays that end up on the [large object heap](http://stackoverflow.com/questions/8951836/why-large-object-heap-and-why-do-we-care)? – dbc Dec 27 '15 at 08:53
  • @dbc, what would it read them into? Ideally I want an implementation that writes directly into a string's backing char[] data. – Drew Noakes Dec 27 '15 at 10:07
  • @dbc I'm trying to maximise throughput. Many of the strings will be relatively short, though there is no upper bound. – Drew Noakes Dec 27 '15 at 10:08
  • Both Encoding.UTF8.GetString and ReadBuffer in StreamReader (with an encoding) pass through this method, [GetChars](http://referencesource.microsoft.com/#mscorlib/system/text/encoding.cs,1271), which is probably the source of the allocations you see. – rene Dec 27 '15 at 15:08

2 Answers


You can't avoid allocating the byte[] when the encoding is variable-length, as UTF-8 is. The length of the resulting string can only be determined after all of the bytes have been read.
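To make the variable-length point concrete, here is a minimal sketch that compares the UTF-8 byte count of a string with its length in chars. The `Utf8LengthDemo` class and its `Measure` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class Utf8LengthDemo
{
    // Returns the UTF-8 byte count and the UTF-16 char count of a string.
    public static (int Bytes, int Chars) Measure(string s) =>
        (Encoding.UTF8.GetByteCount(s), s.Length);
}
```

For example, `Measure("a")` gives (1, 1), `Measure("\u00E9")` gives (2, 1), `Measure("\u20AC")` gives (3, 1), and `Measure("\U0001F600")` gives (4, 2). The same number of bytes can therefore decode to different numbers of chars, so n alone does not tell you the string length.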

Let's see the UTF8Encoding.GetString method:

public override unsafe String GetString(byte[] bytes, int index, int count)
{
    // Avoid problems with empty input buffer
    if (bytes.Length == 0) return String.Empty;

    fixed (byte* pBytes = bytes)
        return String.CreateStringFromEncoding(
            pBytes + index, count, this);
}

It calls the String.CreateStringFromEncoding method, which first computes the length of the resulting string, then allocates the string and fills it with characters without any additional allocations. UTF8Encoding.GetChars does not allocate anything either.

unsafe static internal String CreateStringFromEncoding(
    byte* bytes, int byteLength, Encoding encoding)
{
    int stringLength = encoding.GetCharCount(bytes, byteLength, null);

    if (stringLength == 0)
        return String.Empty;

    String s = FastAllocateString(stringLength);
    fixed (char* pTempChars = &s.m_firstChar)
    {
        encoding.GetChars(bytes, byteLength, pTempChars, stringLength, null);
    }

    return s;
}

If you use a fixed-length encoding, you can allocate the string directly and call Encoding.GetChars on it. But you will lose performance by calling Stream.ReadByte many times, since there is no Stream.Read overload that accepts a byte* argument.

const int bufferSize = 256;

string str = new string('\0', n / bytesPerCharacter);
byte* bytes = stackalloc byte[bufferSize];

fixed (char* pinnedChars = str)
{
    char* chars = pinnedChars;

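    // Use i > 0 so that no empty chunk is processed when n is a multiple of bufferSize.
    for (int i = n; i > 0; i -= bufferSize)
    {
        int byteCount = Math.Min(bufferSize, i);
        int charCount = byteCount / bytesPerCharacter;

        for (int j = 0; j < byteCount; ++j)
        {
            // ReadByte returns -1 at end of stream; don't cast that to a byte.
            int b = stream.ReadByte();
            if (b < 0)
                throw new EndOfStreamException();
            bytes[j] = (byte)b;
        }

        encoding.GetChars(bytes, byteCount, chars, charCount);

        chars += charCount;
    }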
}

So you are already using the best available way to get strings. The only thing left to do in this situation is to implement a ByteArrayCache class, similar to StringBuilderCache.

public static class ByteArrayCache
{
    [ThreadStatic]
    private static byte[] cachedInstance;

    private const int maxArraySize = 1024;

    public static byte[] Acquire(int size)
    {
        if (size <= maxArraySize)
        {
            byte[] instance = cachedInstance;

            if (cachedInstance != null && cachedInstance.Length >= size)
            {
                cachedInstance = null;
                return instance;
            }
        }

        return new byte[size];
    }

    public static void Release(byte[] array)
    {
        if ((array != null && array.Length <= maxArraySize) &&
            (cachedInstance == null || cachedInstance.Length < array.Length))
        {
            cachedInstance = array;
        }
    }
}

Usage:

var bytes = ByteArrayCache.Acquire(n);
stream.Read(bytes, 0, n);

// The cached array may be longer than n, so decode only the first n bytes.
var str = Encoding.UTF8.GetString(bytes, 0, n);
ByteArrayCache.Release(bytes);
Siyual
Yoh Deadfall
  • Just to close the loop, there's no unsafe `Stream.Read(byte *buffer, int offset, int count)` method on `Stream` that would allow n bytes to be read from the stream into an unsafe array. If there were, then copying the logic of `CreateStringFromEncoding()` might be performant. – dbc Dec 27 '15 at 18:58
  • The other problem with calling `ReadByte` this way is that EOS (-1) is not detected. It seems the limitation here is that there's no way of getting multiple bytes from a stream without using an array. I did try reusing a thread local array but found it hurt performance. Not completely sure why though, so I'll try a bit more. – Drew Noakes Dec 28 '15 at 02:48

For those who don't want to implement their own array reuse logic and don't want to deal with unsafe code either, there are the ArrayPool<T> class and the Span<T> struct, available in .NET Core, .NET 5+ and .NET Standard 2.1+.

Using ArrayPool<T>

As the name suggests, it lets you reuse arrays, which reduces GC overhead.

Your code would look something like this:

// rent an existing byte array instead of creating a new one
var bytes = ArrayPool<byte>.Shared.Rent(n); 

// do your thing ...
stream.Read(bytes, 0, n);
// Rent may return an array longer than n, so decode only the first n bytes.
var str = Encoding.UTF8.GetString(bytes, 0, n);

// return the rented array so it can be reused.
// Optionally you can tell the array pool to clear it too if you want an empty array in the next reuse cycle.
ArrayPool<byte>.Shared.Return(bytes);
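A slightly more defensive variant of the same idea might look like the sketch below. It loops until n bytes have arrived (Stream.Read may return fewer bytes than requested), treats a premature end of stream as an error, and returns the array in a finally block. The `PooledUtf8Reader` class and `ReadUtf8String` method are hypothetical names, and throwing EndOfStreamException on a short stream is an assumption about the desired behaviour.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Text;

public static class PooledUtf8Reader
{
    // Hypothetical helper: reads exactly n bytes and decodes them as UTF-8.
    public static string ReadUtf8String(Stream stream, int n)
    {
        if (n == 0) return string.Empty;

        // Rent may hand back an array longer than n, so n is tracked explicitly.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(n);
        try
        {
            int total = 0;
            while (total < n)
            {
                // Read may return fewer bytes than requested; 0 means end of stream.
                int read = stream.Read(buffer, total, n - total);
                if (read == 0)
                    throw new EndOfStreamException();
                total += read;
            }

            // Decode only the first n bytes, not the whole rented array.
            return Encoding.UTF8.GetString(buffer, 0, n);
        }
        finally
        {
            // Always hand the array back, even if reading or decoding throws.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

Usage is then a single call such as `var str = PooledUtf8Reader.ReadUtf8String(stream, n);`.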

Using Span<T>

If you are certain that n will never get too big, you could even use stackalloc and Span<T>, which makes your code even faster because the GC isn't involved at all (stack memory is cheap).

// Create your buffer.
Span<byte> bytes = stackalloc byte[n];

// do your thing ...
stream.Read(bytes);
var str = Encoding.UTF8.GetString(bytes);

// No need to free or collect anything. Your buffer will just be popped off the stack once the method returns.

Again, be careful you don't overflow your stack with huge values of n. See this question about the stack capacity in C#.
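One common way to guard against large n is to use the stack only below a threshold and fall back to ArrayPool<T> above it. The following is a sketch of that pattern, not a tuned implementation: the `StackOrPoolUtf8Reader` class, the `ReadUtf8String` method and the 256-byte `MaxStackBytes` threshold are all assumptions chosen for illustration.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Text;

public static class StackOrPoolUtf8Reader
{
    // Assumed threshold; measure and tune for your own workload.
    private const int MaxStackBytes = 256;

    public static string ReadUtf8String(Stream stream, int n)
    {
        byte[] rented = null;

        // Small requests use the stack, larger ones rent from the shared pool.
        Span<byte> buffer = n <= MaxStackBytes
            ? stackalloc byte[MaxStackBytes]
            : (rented = ArrayPool<byte>.Shared.Rent(n));
        buffer = buffer.Slice(0, n);

        try
        {
            int total = 0;
            while (total < n)
            {
                // Read may return fewer bytes than requested; 0 means end of stream.
                int read = stream.Read(buffer.Slice(total));
                if (read == 0)
                    throw new EndOfStreamException();
                total += read;
            }

            return Encoding.UTF8.GetString(buffer);
        }
        finally
        {
            // Only pooled arrays need returning; stack memory is released on return.
            if (rented != null)
                ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Since the question says many strings are short, most calls would take the stack path, and only unusually long strings would touch the pool.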

Frederik Hoeft