1

I am implementing a MidiReader, and it needs me to read either MSB-first or LSB-first UInts (8, 16, 32, or 64 bits). I know little about binary and types, so I'm currently copying other people's code from C#.

class ByteArrayReader
{
public:
    unsigned char* ByteArray;
    unsigned int Size;
    unsigned int Index = 0;

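    // The length is passed in explicitly: sizeof on a pointer yields the
    // size of the pointer itself, not the size of the array it points to.
    ByteArrayReader(unsigned char* byteArray, unsigned int size)
    {
        if (byteArray == NULL)
        {
            throw byteArray;
        }
        ByteArray = byteArray;
        Size = size;
        Index = 0;
    }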

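    // Returns unsigned char so that bytes >= 0x80 are not sign-extended
    // when the caller shifts and combines them.
    unsigned char inline Read()
    {
        return ByteArray[Index++];
    }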

    void inline Forward(unsigned int length = 1)
    {
        Index += length;
    }

    void inline Backward(unsigned int length = 1)
    {
        if (length > Index)
        {
            throw length;
        }

        Index -= length;
    }

    bool operator==(ByteArrayReader) = delete;
};

These are what I copied:


    uint16_t inline ReadUInt16()
    {
        return (uint16_t)((Read() << 8) | Read());
    }

    uint32_t inline ReadUInt32()
    {
        return (uint32_t)((((((Read() << 8) | Read()) << 8) | Read()) << 8) | Read());
    }


But I'm told that one of them reads an MSB-first UInt. So I want to ask how to read UInt types from binary data elegantly, and also to learn how a uint is represented in bytes.

Player01

2 Answers

1

The part

(uint32_t)((((((Read() << 8) | Read()) << 8) | Read()) << 8) | Read());

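does not have a guaranteed result, because each call to Read() increments the counter Index and C++ does not fix the order in which the operands of | are evaluated. Strictly speaking this is unspecified rather than undefined behavior, since function calls are never interleaved, but the compiler is still free to perform the reads in an order that puts the bytes in the wrong positions.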

It would be better if they were computed in order like this:

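uint32_t chunk1 = (unsigned char)Read(); // reads ByteArray[x]
uint32_t chunk2 = (unsigned char)Read(); // reads ByteArray[x+1]
uint32_t chunk3 = (unsigned char)Read(); // reads ByteArray[x+2]
uint32_t chunk4 = (unsigned char)Read(); // reads ByteArray[x+3]
uint32_t result = (chunk1 << 24) | (chunk2 << 16) | (chunk3 << 8) | chunk4;

Each statement finishes before the next one starts, so the increments are guaranteed to happen in order.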

Byte order differs between little-endian and big-endian systems. That topic is covered in the related question: Detecting endianness programmatically in a C++ program
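When the file format itself fixes the byte order, assembling the value with shifts should give the same result on any host, because shifts operate on values rather than on memory layout. A minimal sketch of how the same four bytes decode under each convention; the helper names DecodeMSBFirst and DecodeLSBFirst are hypothetical and not part of the question's class:

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not from the original post.

// Most significant byte is stored first (big-endian layout).
uint32_t DecodeMSBFirst(const unsigned char* p)
{
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
}

// Least significant byte is stored first (little-endian layout).
uint32_t DecodeLSBFirst(const unsigned char* p)
{
    return  uint32_t(p[0])        | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}
```

For the byte sequence 12 34 56 78 (hex), DecodeMSBFirst yields 0x12345678 and DecodeLSBFirst yields 0x78563412.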

huseyin tugrul buyukisik
  • Thanks! It really helps, but won't the compiler call the inner one first? I copied it from C#, so maybe there are differences – Player01 Jan 22 '23 at 11:26
  • The MIDI format defines which parts are MSB-first or LSB-first, so I don't need to detect the byte order – Player01 Jan 22 '23 at 11:29
  • When I compute `ctr++ + ctr++ + ctr++ + ctr++` on my system, it gives the expected answer, but that doesn't mean it will be the same for other compiler versions, other CPUs, other operating systems, etc. Even `++(++(++(++c)))` is undefined behavior, right? – huseyin tugrul buyukisik Jan 22 '23 at 11:31
0

Try this:

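uint32_t inline ReadUInt32MSBfirst()
{
    // Go through unsigned char so a negative char from Read() is not
    // sign-extended, and shift as uint32_t to avoid signed overflow.
    uint32_t b1 = (unsigned char)Read();
    uint32_t b2 = (unsigned char)Read();
    uint32_t b3 = (unsigned char)Read();
    uint32_t b4 = (unsigned char)Read();
    return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
}

uint32_t inline ReadUInt32LSBfirst()
{
    uint32_t b1 = (unsigned char)Read();
    uint32_t b2 = (unsigned char)Read();
    uint32_t b3 = (unsigned char)Read();
    uint32_t b4 = (unsigned char)Read();
    return b1 | (b2 << 8) | (b3 << 16) | (b4 << 24);
}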
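
Since the question asks about 8, 16, 32 and 64 bits, the same idea can be written once as a template. This is only a sketch under stated assumptions: SketchReader is a hypothetical stand-in for the question's ByteArrayReader, the explicit size parameter and the bounds check are additions, and bytes are assumed to be 8 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// Hypothetical stand-in for the question's ByteArrayReader (not the
// original class); it takes the buffer length explicitly.
class SketchReader
{
public:
    SketchReader(const unsigned char* data, std::size_t size)
        : data_(data), size_(size), index_(0) {}

    // Reads sizeof(T) bytes, most significant byte first.
    template <typename T>
    T ReadMSBFirst()
    {
        static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer type");
        CheckAvailable(sizeof(T));
        T value = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i)
        {
            // Make room for the next byte, then append it at the low end.
            value = static_cast<T>((value << 8) | data_[index_++]);
        }
        return value;
    }

    // Reads sizeof(T) bytes, least significant byte first.
    template <typename T>
    T ReadLSBFirst()
    {
        static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer type");
        CheckAvailable(sizeof(T));
        T value = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i)
        {
            // Byte i contributes bits 8*i .. 8*i+7 of the result.
            value = static_cast<T>(value | (static_cast<T>(data_[index_++]) << (8 * i)));
        }
        return value;
    }

private:
    // Throws instead of reading past the end of the buffer.
    void CheckAvailable(std::size_t count) const
    {
        if (count > size_ - index_)
        {
            throw std::out_of_range("read past end of buffer");
        }
    }

    const unsigned char* data_;
    std::size_t size_;
    std::size_t index_;
};
```

With this sketch, a 32-bit MSB-first field would be read as reader.ReadMSBFirst<uint32_t>() and a 16-bit LSB-first field as reader.ReadLSBFirst<uint16_t>(). Each byte is consumed in its own loop iteration, so the read order does not depend on how the compiler evaluates a larger expression.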
EddieLotter