39

I'm trying to improve my understanding of the STFS file format by using a program to read all the different bits of information. Using a website with a reference of which offsets contain what information, I wrote some code that has a binary reader go through the file and place the values in the correct variables.

The problem is that all of the data is supposed to be big-endian, but BinaryReader reads everything as little-endian. What's the best way to fix this?

Can I write a stand-in class for BinaryReader that returns the bytes in reverse order? Or is there something I can change on the BinaryReader instance to make it read big-endian, so that I don't have to rewrite everything?

Any help is appreciated.

Edit: I tried passing Encoding.BigEndianUnicode to the constructor, but it still reads little-endian.

mowwwalker
  • @HansPassant, Would this be one of those dlls that require me to make my code open source? Why do some dlls require that? – mowwwalker Dec 23 '11 at 21:53
  • Walkerneo I deleted my answer because zmbq answered essentially the same thing 3 minutes before me. The concept of endianness does not apply to byte arrays, only to words, dwords, qwords, etc., that is to groups of 2, 4, 8 and so on bytes. I am sorry if it would mean changing a lot of code, but a man has to do what a man has to do. – Mike Nakis Dec 23 '11 at 21:57
  • Skeet sells books, the code has few strings attached. Check the license section on that page. Apache terms are here: http://www.apache.org/licenses/LICENSE-2.0.html – Hans Passant Dec 23 '11 at 21:58
  • If what you are concerned about is extracting words, dwords, qwords etc. AND converting them to the proper endianness in one step, then this question has been answered elsewhere: http://stackoverflow.com/questions/1674160/converting-little-endian-to-int – Mike Nakis Dec 23 '11 at 22:00
  • @MikeNakis, Oh yeah, you're right about the byte arrays. I'm still learning :D – mowwwalker Dec 23 '11 at 22:03
  • @HansPassant, His Binary Reader doesn't have all the methods of the system's binary reader.. – mowwwalker Dec 23 '11 at 22:20
  • @HansPassant, Thanks, but I answered my own question :D – mowwwalker Dec 23 '11 at 22:42
  • Well, there you go. Good programmers spin miracles in 21 minutes or less :) – Hans Passant Dec 23 '11 at 22:51
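
To illustrate the point made in the comments that endianness belongs to multi-byte values rather than to the byte array itself, here is a minimal sketch. The class and method names (`EndiannessDemo`, `Interpret`) are hypothetical, and it assumes .NET Core 2.1+ so that `System.Buffers.Binary.BinaryPrimitives` is available.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical demo type; not part of the question or any answer.
public static class EndiannessDemo
{
    // Decodes the same four bytes twice: once as big-endian, once as little-endian.
    // The array is never modified; only the interpretation differs.
    public static (int BigEndian, int LittleEndian) Interpret(byte[] fourBytes)
    {
        int bigEndian = BinaryPrimitives.ReadInt32BigEndian(fourBytes);
        int littleEndian = BinaryPrimitives.ReadInt32LittleEndian(fourBytes);
        return (bigEndian, littleEndian);
    }
}
```

For the bytes `00 00 00 01`, the big-endian interpretation is 1 and the little-endian interpretation is 16777216 (0x01000000).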

7 Answers

42

I'm not usually one to answer my own questions, but I've accomplished exactly what I wanted with some simple code:

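using System;
using System.IO;

// Reads multi-byte integers as big-endian by reversing the bytes before conversion.
// Assumes a little-endian host (BitConverter.IsLittleEndian == true).
class BinaryReader2 : BinaryReader { 
    public BinaryReader2(System.IO.Stream stream)  : base(stream) { }

    public override int ReadInt32()
    {
        var data = base.ReadBytes(4);
        Array.Reverse(data);
        return BitConverter.ToInt32(data, 0);
    }

    // "override" is required here; without it the method only hides the base
    // method and is skipped when called through a BinaryReader reference.
    public override Int16 ReadInt16()
    {
        var data = base.ReadBytes(2);
        Array.Reverse(data);
        return BitConverter.ToInt16(data, 0);
    }

    public override Int64 ReadInt64()
    {
        var data = base.ReadBytes(8);
        Array.Reverse(data);
        return BitConverter.ToInt64(data, 0);
    }

    public override UInt32 ReadUInt32()
    {
        var data = base.ReadBytes(4);
        Array.Reverse(data);
        return BitConverter.ToUInt32(data, 0);
    }

}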

I knew that's what I wanted, but I didn't know how to write it. I found this page and it helped: http://www.codekeep.net/snippets/870c4ab3-419b-4dd2-a950-6d45beaf1295.aspx

Snicker
mowwwalker
  • 12
    Off-topic, but your class's fields (`a16` etc) are unnecessary. You assign an array to them during construction, but within each method you replace that array with a new array returned by the `Read` function. You could just put `var a32 = base.ReadBytes...` in each method and get rid of the fields. – Daniel Earwicker Nov 14 '12 at 12:10
  • 16
    They're not unnecessary, they're harmful. Turning what is (potentially, ignoring the share underlying stream) a thread-safe code into a shared state situation. – skolima Mar 05 '14 at 15:45
  • 9
    You probably want to check `BitConverter.IsLittleEndian` before reversing. If it is `false` you don't need to reverse. – João Portela Mar 28 '14 at 18:41
  • 1
    @JoãoPortela that depends upon the expect endianess of the source data! :D – Gusdor Apr 23 '14 at 15:28
  • 4
    @Gusdor Yes. You have to know the endianess of the source data and the endianess of BitConverter, if they don't match: reverse it. – João Portela Apr 24 '14 at 08:56
  • 5
    Allocating an array for each read creates work for the GC. You might just call `ReadByte` as many times as required, then shift/OR the bytes into place. – Drew Noakes Nov 09 '16 at 23:18
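
The shift/OR approach suggested in the last comment could look roughly like the sketch below. The class and method names (`ShiftingBigEndianReads`, `ReadBigEndianUInt16`, and so on) are hypothetical and chosen only for illustration; the only framework call used is `BinaryReader.ReadByte`.

```csharp
using System;
using System.IO;

// Hypothetical helper; names are illustrative and not taken from the answer above.
public static class ShiftingBigEndianReads
{
    // Reads two bytes, most significant first, without allocating an array.
    public static ushort ReadBigEndianUInt16(this BinaryReader reader)
    {
        int high = reader.ReadByte(); // ReadByte throws EndOfStreamException at end of stream
        int low = reader.ReadByte();
        return (ushort)((high << 8) | low);
    }

    // Reads four bytes, most significant first.
    public static uint ReadBigEndianUInt32(this BinaryReader reader)
    {
        // Separate statements keep the read order explicit.
        uint b0 = reader.ReadByte();
        uint b1 = reader.ReadByte();
        uint b2 = reader.ReadByte();
        uint b3 = reader.ReadByte();
        return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
    }

    // Signed variant: reinterpret the unsigned bit pattern.
    public static int ReadBigEndianInt32(this BinaryReader reader)
    {
        return unchecked((int)reader.ReadBigEndianUInt32());
    }
}
```

Because the value is assembled arithmetically rather than through `BitConverter`, the result does not depend on the endianness of the machine running the code.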
18

IMHO this is a slightly better answer, as it doesn't require a different class to be newed up, makes the big-endian calls obvious, and allows big- and little-endian calls to be mixed on the same stream.

public static class Helpers
{
  // Note this MODIFIES THE GIVEN ARRAY then returns a reference to the modified array.
  public static byte[] Reverse(this byte[] b)
  {
    Array.Reverse(b);
    return b;
  }

  public static UInt16 ReadUInt16BE(this BinaryReader binRdr)
  {
    return BitConverter.ToUInt16(binRdr.ReadBytesRequired(sizeof(UInt16)).Reverse(), 0);
  }

  public static Int16 ReadInt16BE(this BinaryReader binRdr)
  {
    return BitConverter.ToInt16(binRdr.ReadBytesRequired(sizeof(Int16)).Reverse(), 0);
  }

  public static UInt32 ReadUInt32BE(this BinaryReader binRdr)
  {
    return BitConverter.ToUInt32(binRdr.ReadBytesRequired(sizeof(UInt32)).Reverse(), 0);
  }

  public static Int32 ReadInt32BE(this BinaryReader binRdr)
  {
    return BitConverter.ToInt32(binRdr.ReadBytesRequired(sizeof(Int32)).Reverse(), 0);
  }

  public static byte[] ReadBytesRequired(this BinaryReader binRdr, int byteCount)
  {
    var result = binRdr.ReadBytes(byteCount);

    if (result.Length != byteCount)
      throw new EndOfStreamException(string.Format("{0} bytes required from stream, but only {1} returned.", byteCount, result.Length));

    return result;
  }
}
Tim Williams
  • 13
    Remember to check `BitConverter.IsLittleEndian` before reversing. – João Portela Mar 28 '14 at 18:41
  • looks like you need ".ToArray()" after the Reverse, since Reverse returns IEnumerable and not byte[] (which is what the BitConverter expects) – Joezer Dec 25 '18 at 15:28
  • 3
    Since .NET Core, there is also a BinaryPrimitives class making this obsolete: https://learn.microsoft.com/en-us/dotnet/api/system.buffers.binary.binaryprimitives – nikeee Mar 24 '20 at 20:00
  • @JoãoPortela Actually, that's not necessary. `BinaryReader` *always* reads in little-endian, as you can see in the [source code](https://github.com/dotnet/runtime/blob/6d2340542176acce320a932084febaed4fd9da5b/src/libraries/System.Private.CoreLib/src/System/IO/BinaryReader.cs#L232) – it actually just calls BinaryPrimitives.Read*Whatever*LittleEndian. Kind of wish they'd include big-endian versions by default… seems like it would be pretty easy. – Morgan Harris May 11 '22 at 00:30
  • A much shorter and probably faster version of this code would just use `BinaryPrimitives.ReverseEndianness`. – Morgan Harris May 11 '22 at 00:33
  • @MorganHarris yes. But that check is for the BitConverter. If this code is run on an architecture with different endianness, the assumptions about how BitConverter works would no longer hold. – João Portela May 12 '22 at 17:11
  • 1
    @JoãoPortela you know what, I didn't even see the `BitConverter` there. Yeah, just don't use it IMO. `BinaryPrimitives` is the correct tool here. – Morgan Harris May 13 '22 at 04:33
  • Like @nikeee mentioned in a previous comment, we now have BinaryPrimitives for this. This is a very old answer and it wasn't available at the time. – João Portela May 17 '22 at 11:36
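
A minimal sketch of the `BinaryPrimitives.ReverseEndianness` variant mentioned in the comments, assuming .NET Core 2.1+ and relying on `BinaryReader` reading little-endian as noted above. The class name `ReverseEndiannessReads` is hypothetical; the method names mirror the answer's, so this is meant as a replacement for those helpers rather than something to use alongside them.

```csharp
using System.Buffers.Binary;
using System.IO;

// Hypothetical replacement for the Helpers class above; do not reference both,
// because the extension method names would then be ambiguous.
public static class ReverseEndiannessReads
{
    // BinaryReader decodes the bytes as little-endian; swapping the byte order
    // of the resulting value yields the big-endian interpretation.
    public static short ReadInt16BE(this BinaryReader reader)
        => BinaryPrimitives.ReverseEndianness(reader.ReadInt16());

    public static ushort ReadUInt16BE(this BinaryReader reader)
        => BinaryPrimitives.ReverseEndianness(reader.ReadUInt16());

    public static int ReadInt32BE(this BinaryReader reader)
        => BinaryPrimitives.ReverseEndianness(reader.ReadInt32());

    public static uint ReadUInt32BE(this BinaryReader reader)
        => BinaryPrimitives.ReverseEndianness(reader.ReadUInt32());
}
```

No intermediate array is allocated, and the built-in reads already throw `EndOfStreamException` when the stream runs out, so no separate length check is needed.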
8

A mostly-complete (for my purposes) drop-in replacement for BinaryReader that handles endianness correctly, unlike most of these answers. By default it works exactly like BinaryReader, but can be constructed to read in the required endianness. Additionally the Read<Primitive> methods are overloaded to allow you to specify the endianness to read a particular value in - useful in the (unlikely) scenario that you're dealing with a stream of mixed LE/BE data.

public class EndiannessAwareBinaryReader : BinaryReader
{
    public enum Endianness
    {
        Little,
        Big,
    }

    private readonly Endianness _endianness = Endianness.Little;

    public EndiannessAwareBinaryReader(Stream input) : base(input)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding) : base(input, encoding)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen) : base(input, encoding, leaveOpen)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Endianness endianness) : base(input)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, Endianness endianness) : base(input, encoding)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen, Endianness endianness) : base(input, encoding, leaveOpen)
    {
        _endianness = endianness;
    }

    public override short ReadInt16() => ReadInt16(_endianness);

    public override int ReadInt32() => ReadInt32(_endianness);

    public override long ReadInt64() => ReadInt64(_endianness);

    public override ushort ReadUInt16() => ReadUInt16(_endianness);

    public override uint ReadUInt32() => ReadUInt32(_endianness);

    public override ulong ReadUInt64() => ReadUInt64(_endianness);

    // The explicit startIndex (0) keeps these calls compiling on .NET Framework as well as .NET Core.
    public short ReadInt16(Endianness endianness) => BitConverter.ToInt16(ReadForEndianness(sizeof(short), endianness), 0);

    public int ReadInt32(Endianness endianness) => BitConverter.ToInt32(ReadForEndianness(sizeof(int), endianness), 0);

    public long ReadInt64(Endianness endianness) => BitConverter.ToInt64(ReadForEndianness(sizeof(long), endianness), 0);

    public ushort ReadUInt16(Endianness endianness) => BitConverter.ToUInt16(ReadForEndianness(sizeof(ushort), endianness), 0);

    public uint ReadUInt32(Endianness endianness) => BitConverter.ToUInt32(ReadForEndianness(sizeof(uint), endianness), 0);

    public ulong ReadUInt64(Endianness endianness) => BitConverter.ToUInt64(ReadForEndianness(sizeof(ulong), endianness), 0);

    private byte[] ReadForEndianness(int bytesToRead, Endianness endianness)
    {
        var bytesRead = ReadBytes(bytesToRead);

        if ((endianness == Endianness.Little && !BitConverter.IsLittleEndian)
            || (endianness == Endianness.Big && BitConverter.IsLittleEndian))
        {
            Array.Reverse(bytesRead);
        }

        return bytesRead;
    }
}
Ian Kemp
  • 1
    Best solution, handles host system endianness as well as source data endianness and only reverses data when it has to be reversed. – Thomas Hilbert May 19 '20 at 22:06
  • Awesome solution, however the BitConverter methods needed an extra parameter of startIndex appended: public short ReadInt16(Endianness endianness) => BitConverter.ToInt16(ReadForEndianness(sizeof(short), endianness), 0); – Peter Wilson Jul 19 '20 at 02:55
  • @PeterWilson Are you trying to use this in .NET Framework? – Ian Kemp Jul 19 '20 at 18:57
  • That clearly works, but it is not very efficient: CPUs have a single instruction for endianness conversion, [bswap](https://c9x.me/x86/html/file_module_x86_id_21.html), which avoids the array manipulation. – Bogdan Mart Aug 18 '20 at 12:08
  • 1
    Great solution, in case you are using .NET Core 2.1+, you can use [this](https://gist.github.com/akamud/18ea0da3385da9fd932580c90be54e3f) version that uses `BinaryPrimitives`, so the reversing is handled by the framework and it is more performant according to Stephen Toub https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-core-2-1/. – Mahmoud Ali Jan 11 '21 at 15:26
8

I'm not familiar with STFS, but changing endianness is relatively easy. "Network order" is big-endian, so all you need to do is translate from network to host order.

This is easy because there's already code that does that. Look at IPAddress.NetworkToHostOrder, as explained here: ntohs() and ntohl() equivalent?
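
A minimal sketch of that approach, assuming the raw bytes are first interpreted in host order with `BitConverter` and then converted with `IPAddress.NetworkToHostOrder`. The class and method names (`NetworkOrderReads`, `ReadInt32NetworkOrder`, `ReadUInt32NetworkOrder`) are hypothetical. `NetworkToHostOrder` only has overloads for `short`, `int` and `long`, so the unsigned variant casts through the signed one.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; names are illustrative only.
public static class NetworkOrderReads
{
    // Reads a big-endian (network order) 32-bit integer from the reader.
    public static int ReadInt32NetworkOrder(BinaryReader reader)
    {
        byte[] raw = reader.ReadBytes(sizeof(int));
        if (raw.Length != sizeof(int))
            throw new EndOfStreamException();

        // BitConverter interprets the bytes in host order;
        // NetworkToHostOrder then swaps them only if the host is little-endian.
        return IPAddress.NetworkToHostOrder(BitConverter.ToInt32(raw, 0));
    }

    // There is no unsigned overload, so reinterpret the signed result.
    public static uint ReadUInt32NetworkOrder(BinaryReader reader)
    {
        return unchecked((uint)ReadInt32NetworkOrder(reader));
    }
}
```

Pairing `NetworkToHostOrder` with `BitConverter` (rather than with `BinaryReader.ReadInt32`) keeps both steps tied to the host's byte order, so the result should be the same on little- and big-endian machines.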

Community
zmbq
6

In my opinion, you want to be careful doing this. The reason to convert from big-endian to little-endian is that the bytes being read are big-endian while the platform doing the arithmetic on them is little-endian.

C# isn't a Windows-only language anymore. With ports like Mono, and with other Microsoft platforms such as Windows Phone 7/8, Xbox 360/Xbox One, Windows CE, and Windows 8 Mobile, as well as Linux and Apple platforms via Mono, it is quite possible that the operating platform is big-endian, in which case reversing the bytes without any checks would give you wrong values.

BitConverter already has a field called "IsLittleEndian", which you can use to determine whether the operating environment is little-endian. Then you can do the reversing conditionally.

As such, I actually just wrote some byte[] extensions instead of making a big class:

    /// <summary>
    /// Gets a byte array from a point in a source byte array and reverses the bytes. Note: if the current platform is not little-endian, the input array is assumed to be big-endian and the bytes are returned without being reversed.
    /// </summary>
    /// <param name="byteArray">The source array to get reversed bytes for</param>
    /// <param name="startIndex">The index in the source array at which to begin the reverse</param>
    /// <param name="count">The number of bytes to reverse</param>
    /// <returns>A new array containing the reversed bytes, or a sub set of the array not reversed.</returns>
    public static byte[] ReverseForBigEndian(this byte[] byteArray, int startIndex, int count)
    {
        if (BitConverter.IsLittleEndian)
            return byteArray.Reverse(startIndex, count);
        else
            return byteArray.SubArray(startIndex, count);

    }

    public static byte[] Reverse(this byte[] byteArray, int startIndex, int count)
    {
        byte[] ret = new byte[count];
        for (int i = startIndex + (count - 1); i >= startIndex; --i)
        {
            byte b = byteArray[i];
            ret[(startIndex + (count - 1)) - i] = b;
        }
        return ret;
    }

    public static byte[] SubArray(this byte[] byteArray, int startIndex, int count)
    {
        byte[] ret = new byte[count];
        for (int i = 0; i < count; ++i)            
            ret[i] = byteArray[i + startIndex]; // copy to position i, not always to position 0
        return ret;
    }

So imagine this example code:

byte[] fontBytes = new byte[240000]; // some data loaded in here, e.g. a TrueType Collection (TTC) font file, which is big-endian

int _ttcVersionMajor = BitConverter.ToUInt16(fontBytes.ReverseForBigEndian(4, 2), 0);

// Result: _ttcVersionMajor == 1 (the TTC header is version 1)
Andy Mikula
Ryan Mann
2

You are better off using the BinaryPrimitives class (these overrides belong in a class derived from BinaryReader):

        public override double ReadDouble()
        {
            return BinaryPrimitives.ReadDoubleBigEndian(ReadBytes(8));
        }

        public override short ReadInt16()
        {
            return BinaryPrimitives.ReadInt16BigEndian(ReadBytes(2));
        }

        public override int ReadInt32()
        {
            return BinaryPrimitives.ReadInt32BigEndian(ReadBytes(4));
        }

        public override long ReadInt64()
        {
            return BinaryPrimitives.ReadInt64BigEndian(ReadBytes(8));
        }

        public override float ReadSingle()
        {
            return BinaryPrimitives.ReadSingleBigEndian(ReadBytes(4));
        }

        public override ushort ReadUInt16()
        {
            return BinaryPrimitives.ReadUInt16BigEndian(ReadBytes(2));
        }

        public override uint ReadUInt32()
        {
            return BinaryPrimitives.ReadUInt32BigEndian(ReadBytes(4));
        }

        public override ulong ReadUInt64()
        {
            return BinaryPrimitives.ReadUInt64BigEndian(ReadBytes(8));
        }
Timur Vafin
1

I've expanded on Ian Kemp's excellent suggestion by using the new BinaryPrimitives class, available in .NET Core 2.1+. According to Stephen Toub's post, it is more performant, and it handles the endianness and reversal internally.

So if you are running .NET Core 2.1+ you should definitely use this version:

public class EndiannessAwareBinaryReader : BinaryReader
{
    public enum Endianness
    {
        Little,
        Big,
    }

    private readonly Endianness _endianness = Endianness.Little;

    public EndiannessAwareBinaryReader(Stream input) : base(input)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding) : base(input, encoding)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen) : base(
        input, encoding, leaveOpen)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Endianness endianness) : base(input)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, Endianness endianness) :
        base(input, encoding)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen,
        Endianness endianness) : base(input, encoding, leaveOpen)
    {
        _endianness = endianness;
    }

    public override short ReadInt16() => ReadInt16(_endianness);

    public override int ReadInt32() => ReadInt32(_endianness);

    public override long ReadInt64() => ReadInt64(_endianness);

    public override ushort ReadUInt16() => ReadUInt16(_endianness);

    public override uint ReadUInt32() => ReadUInt32(_endianness);

    public override ulong ReadUInt64() => ReadUInt64(_endianness);

    public short ReadInt16(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt16LittleEndian(ReadBytes(sizeof(short)))
        : BinaryPrimitives.ReadInt16BigEndian(ReadBytes(sizeof(short)));

    public int ReadInt32(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt32LittleEndian(ReadBytes(sizeof(int)))
        : BinaryPrimitives.ReadInt32BigEndian(ReadBytes(sizeof(int)));

    public long ReadInt64(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt64LittleEndian(ReadBytes(sizeof(long)))
        : BinaryPrimitives.ReadInt64BigEndian(ReadBytes(sizeof(long)));

    public ushort ReadUInt16(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt16LittleEndian(ReadBytes(sizeof(ushort)))
        : BinaryPrimitives.ReadUInt16BigEndian(ReadBytes(sizeof(ushort)));

    public uint ReadUInt32(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt32LittleEndian(ReadBytes(sizeof(uint)))
        : BinaryPrimitives.ReadUInt32BigEndian(ReadBytes(sizeof(uint)));

    public ulong ReadUInt64(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt64LittleEndian(ReadBytes(sizeof(ulong)))
        : BinaryPrimitives.ReadUInt64BigEndian(ReadBytes(sizeof(ulong)));
}
Mahmoud Ali