0

As far as I know, reading values one at a time in a BinaryReader loop performs poorly. Another approach I can think of is to call ReadAllBytes first and then Buffer.BlockCopy the result into an int[], but that incurs an additional copy. Is it possible to read a huge binary file directly into an int[] efficiently?

  • Can you not use `ReadAllBytes` and either use bytes or use `Enumerable.Cast` to convert to int? – vc 74 Jan 18 '23 at 06:11
  • Are you saying, without actually saying, that every four bytes in the file represents a single integer value? – jmcilhinney Jan 18 '23 at 06:23
  • @jmcilhinney Good point and quite likely – vc 74 Jan 18 '23 at 06:28
  • Possible duplicate: https://stackoverflow.com/questions/1238388/faster-unsafe-binaryreader-in-net – Oliver Jan 18 '23 at 06:38
  • What does your performance profiling of the BinaryReader loop show? Where did you see the bottleneck? Could it be caused by slow disk I/O or an anti-virus tool in the background? Or is it just a guess, because it was slow in .NET 3.5? By the way, if we are talking about performance, what version of .NET are you using? – Oliver Jan 18 '23 at 06:39

2 Answers

3

You can use MemoryMarshal.AsBytes to read all data:

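using System.IO;
using System.Runtime.InteropServices;
using var stream = new FileStream(...);
var target = new int[stream.Length / sizeof(int)];
// Fill the int[] in place through a byte view. Read may return fewer bytes than requested.
stream.Read(MemoryMarshal.AsBytes(target.AsSpan()));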

No BinaryReader is needed in this case. Be aware of the endianness of the int representation: the code above produces wrong values if the byte order of the file does not match that of your hardware.
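Because `Stream.Read` is allowed to return fewer bytes than requested, a more defensive variant loops until the buffer is full. The following is only a minimal sketch, assuming the file holds raw 32-bit integers in the machine's byte order; `IntFileReader` and `ReadInts` are hypothetical names, not part of any library.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

static class IntFileReader
{
    // Hypothetical helper: reads a file of raw 32-bit integers straight into an int[].
    // Assumption: the file's byte order matches the machine's.
    public static int[] ReadInts(string path)
    {
        using var stream = new FileStream(path, FileMode.Open, FileAccess.Read);
        var target = new int[stream.Length / sizeof(int)];

        // View the int[] as bytes so the stream writes directly into it.
        Span<byte> bytes = MemoryMarshal.AsBytes(target.AsSpan());

        // Stream.Read may return fewer bytes than requested, so loop until full.
        while (!bytes.IsEmpty)
        {
            int read = stream.Read(bytes);
            if (read == 0)
                throw new EndOfStreamException("File ended before the buffer was filled.");
            bytes = bytes.Slice(read);
        }

        return target;
    }
}
```

On .NET 7 and later, the loop can be replaced by `stream.ReadExactly(bytes)`. Note that a `Span<byte>` length is an `int`, so this sketch only suits files below 2 GB; larger files would have to be processed in slices.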

Sebastian Schumann
  • This is perfect as long as the stream supports the Length property - and things become difficult if it does not … – 埃博拉酱 Jan 18 '23 at 08:42
  • @埃博拉酱 If the stream does not support `Length` then at some point the code will have to loop. And it won't be possible to pre-size the result buffer, which means that some data copying will be required. – Matthew Watson Jan 18 '23 at 09:28
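The loop described in the last comment could look roughly like the sketch below. It is an illustration under assumptions, not a definitive implementation: `StreamIntReader`, `ReadAllInts`, and `initialCapacity` are hypothetical names, the data is assumed to be in the machine's byte order, and the total size is assumed to stay below 2 GB.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

static class StreamIntReader
{
    // Hypothetical helper for streams that cannot report Length.
    public static int[] ReadAllInts(Stream stream, int initialCapacity = 4096)
    {
        var buffer = new int[Math.Max(1, initialCapacity)];
        int bytesFilled = 0;

        while (true)
        {
            // Unfilled tail of the int[] viewed as bytes.
            Span<byte> free = MemoryMarshal.AsBytes(buffer.AsSpan()).Slice(bytesFilled);
            if (free.IsEmpty)
            {
                // Buffer is full: double it. This is the unavoidable copy.
                Array.Resize(ref buffer, checked(buffer.Length * 2));
                continue;
            }

            int read = stream.Read(free);
            if (read == 0)
                break; // End of stream.
            bytesFilled += read;
        }

        // Trim to the number of complete ints; trailing partial bytes are dropped.
        Array.Resize(ref buffer, bytesFilled / sizeof(int));
        return buffer;
    }
}
```

Doubling the buffer keeps the number of resize copies logarithmic in the final size, at the cost of one last trimming copy.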
0

If you want to read the array as another type, you can use MemoryMarshal.Cast.

using System;
using System.Runtime.InteropServices;

class Program
{
    public static void Main(string[] args)
    {
        byte[] arrayByte = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88 };
        Span<int> spanInt = MemoryMarshal.Cast<byte, int>(arrayByte);

        Console.WriteLine("0x{0:X8}", spanInt[0]); // For little endian it will be 0x44332211.
        Console.WriteLine("0x{0:X8}", spanInt[1]); // For little endian it will be 0x88776655.
    }
}

Another alternative is Unsafe.As. However, it has some problems: for example, Length still reports the number of bytes rather than the number of ints after the conversion. I recommend using the MemoryMarshal class instead.

using System;
using System.Runtime.CompilerServices;

class Program
{
    public static void Main(string[] args)
    {
        byte[] arrayByte = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88 };
        int[] arrayInt = Unsafe.As<byte[], int[]>(ref arrayByte);

        Console.WriteLine("Length={0}", arrayInt.Length); // Prints 8, not 2: Length still counts bytes.
        Console.WriteLine("0x{0:X8}", arrayInt[0]); // For little endian it will be 0x44332211.
        Console.WriteLine("0x{0:X8}", arrayInt[1]); // For little endian it will be 0x88776655.
    }
}
radian
  • ...as long as the endianness of the file matches that of the machine. My old code still composes each int by OR-ing the bytes together. – Mario Vernari Jan 18 '23 at 06:49
  • OK, added a comment about endianness. – radian Jan 18 '23 at 07:08
  • MemoryMarshal.Cast returns a Span, which doesn't implement IEnumerable, which often leads to me having to ToArray it again, resulting in an extra copy. Instead, after Unsafe.As, you can append a Take to correct the array length error, and it returns IEnumerable, and no copy occurs. – 埃博拉酱 Jan 18 '23 at 09:22
  • Another problem is that Unsafe.As does not seem to work on Android. It simply converts the bytes into ints. – 埃博拉酱 Jan 18 '23 at 09:34
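To address the endianness concern raised in the comments, one option is `System.Buffers.Binary.BinaryPrimitives`, which either decodes with an explicit byte order (equivalent to OR-ing the bytes by hand) or swaps values in place after a bulk read. A minimal sketch, assuming the file is little endian; `Endianness`, `LittleEndianToHost`, and `ReadLittleEndianInt` are hypothetical names.

```csharp
using System;
using System.Buffers.Binary;

static class Endianness
{
    // Converts values that were bulk-read from little-endian data into host order.
    // On little-endian hardware this is a no-op.
    public static void LittleEndianToHost(Span<int> values)
    {
        if (BitConverter.IsLittleEndian)
            return;

        for (int i = 0; i < values.Length; i++)
            values[i] = BinaryPrimitives.ReverseEndianness(values[i]);
    }

    // Decodes the int at the given int index with an explicit byte order,
    // independent of the machine the code runs on.
    public static int ReadLittleEndianInt(ReadOnlySpan<byte> bytes, int index)
        => BinaryPrimitives.ReadInt32LittleEndian(bytes.Slice(index * sizeof(int), sizeof(int)));
}
```

With the byte array from the answer above, `Endianness.ReadLittleEndianInt(arrayByte, 0)` yields 0x44332211 on any hardware, and calling `Endianness.LittleEndianToHost` on the span returned by `MemoryMarshal.Cast<byte, int>` gives the same values on both little-endian and big-endian machines.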