I want to fill the following struct:

struct Data{
    char id; // 1 byte
    char date[7]; // 7 bytes
    short int Ny; // 2 bytes
    short int Nx; 
    short int L;
    unsigned char distance;
    short int N;
    std::vector<short int> quant_levels;
    std::vector<std::vector<unsigned char>> Pixels;
};

Based on the information on how to decode my binary file:

Byte           Example                Description
0              1                      Format id (1)
1-7            31-Mar-1998 11:55      Date/Time
18-19                                 Number of rows (short int LSB first) (=NY)
20-21                                 Number of columns (=NX)
22                                    Number of 3D-levels (=L)
23-24                                 distance(m) between the 3D-levels (short int LSB first)
25                                    Number of quantisation levels (=N)
26-26+n*2-1                           quantisation levels (N * short int LSB first) in 1/10mm/h
X1-Y1                                 Pixels (NX*NY Bytes) from upper-left to down-right on the 1st 3D-Level
X2-Y2                                 Pixels (NX*NY Bytes) from upper-left to down-right on the 2nd 3D-Level
XL-YL                                 Pixels (NX*NY Bytes) from upper-left to down-right on the Lth 3D-Level

I want to fill the struct while reading my binary file. I have implemented the following, which is not finished because I do not know how to get a short int out of 2 bytes.

Data readFile(const char* filename)
{
    Data data{};
    // open the file:
    std::fstream fh;
    fh.open(filename, std::fstream::in | std::fstream::binary);

    // read the data:
    fh.read((char*)&data.id, sizeof(char));
    fh.read((char*)&data.date, sizeof(data.date));
    fh.read((char*)&data.Ny, sizeof(data.Ny)); // WRONG, how can I move to byte 18?
    // TODO: How to continue

    return data;
}

EDIT

If there is a better way to get the data let me know, I am not restricted to use a struct.

  • As far as I can tell, none of that code is relevant to the question and can be safely removed. You'll probably get better answers if you rephrase this as something like "how to convert byte array to int assuming little-endian encoding". – Norrius Jun 04 '20 at 13:42
  • Your file has a structure, like starting with header and following with data. So you should not use raw buffer that makes very little sense in the context of decoding. Make structures and read the file into structures. Bytes should correspond to char arrays, numbers should correspond to short/long int. If needed make endian conversions. Then read data, it should correspond to array of structures. – armagedescu Jun 04 '20 at 13:54
  • @armagedescu can you give me a small example based on the code I have edited? – Carlos Plaza Jun 04 '20 at 14:00
  • @CarlosPlaza Please check here for usage of structure: http://www.cplusplus.com/doc/tutorial/structures/ and see here about little/big endian conversion https://stackoverflow.com/questions/105252/how-do-i-convert-between-big-endian-and-little-endian-values-in-c – armagedescu Jun 04 '20 at 14:07
  • I know the structs but still do not understand how can I convert from my binary to int for example. The little endian link does not work for me – Carlos Plaza Jun 04 '20 at 14:19
  • You should not convert from binary, that is the thing to do. You should not read into bytes, like you do in readFile. You should create a structure corresponding to the file header and read it into structure. – armagedescu Jun 04 '20 at 14:22
  • Thanks a lot, I know what you mean, but the problem is that I do not know how to do that. I have edited the question with the struct, but I still do not know how to go through the binary file and progressively save the data into my struct – Carlos Plaza Jun 04 '20 at 14:37
  • Your previous version with 2 bytes was actually easier to answer. LSB in this context stands for Least Significant Byte, meaning a [little-endian byte order](https://chortle.ccsu.edu/AssemblyTutorial/Chapter-15/ass15_3.html). So bytes `08 03` become `0x0308` or `776`. Now to read directly into a struct is possible, but won't be portable - it will depend on machine endianness and struct packing/alignment. Note also the gap between where date/time ends (offset 8) and the next item begins (18), or was date/time actually 17 bytes long? – rustyx Jun 04 '20 at 15:25
  • Thanks @rustyx so what do you recommend me to do? Moreover how could I obtain this 776? – Carlos Plaza Jun 04 '20 at 15:29
  • To obtain 776 you can simply do `vec[18] | (vec[19] << 8)` – rustyx Jun 04 '20 at 15:30
  • And how do I implement it while reading the binary file? I mean, where does your vec come from? And the 8? Are you assuming 2 bytes for each vec position? – Carlos Plaza Jun 04 '20 at 15:33
  • `vec` is the vector of bytes from your original post from before the edit. 8 is the number of bits in a byte (always 8). – rustyx Jun 04 '20 at 15:37
  • So you recommend me to read directly to bytes and then make conversions? – Carlos Plaza Jun 04 '20 at 15:38
  • It depends. If you only care about Intel x86, then reading directly into a struct is easier. But if you want to support ARM, Sparc, PPC etc. then you should take care of byte order. – rustyx Jun 04 '20 at 15:52
  • And can you show me how to do the 776 thing reading the binary at the moment? I have the LSB as char s = '8' and the other byte as char s2 = '\x03' but doing data.Ny = s | (s2<< 8); gives me 824 – Carlos Plaza Jun 04 '20 at 16:01
  • Ah yes, '8' is actually a char '8', ascii hex 0x38. So it will become 0x338, or 824 indeed. – rustyx Jun 04 '20 at 16:10
  • But why 0x338 and not 0x38? I am concatenating '0x3' with '8' isnt it? – Carlos Plaza Jun 04 '20 at 16:13
  • The debugger showed you ASCII representation, `'8'` and `'\x03'`. `'8'` is `'\x38'` so it becomes `38 03`, or 0x338. – rustyx Jun 04 '20 at 16:18

1 Answer

Solution 1 (portable)

Read the header into a byte vector and convert each value one-by-one.

For example:

// read the header
std::vector<unsigned char> vec(26);
fh.read((char*)vec.data(), vec.size()); // data() already returns a pointer to the first byte

data.Ny = vec[18] | (vec[19] << 8);
data.Nx = vec[20] | (vec[21] << 8);
data.L = vec[22];
data.distance = vec[23] | (vec[24] << 8);
. . .
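
The conversion above can be wrapped in small helpers. The following is a minimal sketch, assuming the header is exactly 26 bytes with the offsets from the question's table; `u16_le`, `ParsedHeader`, `parse_header` and `parse_levels` are hypothetical names introduced here for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combine two bytes stored LSB first into one 16-bit value.
// Taking unsigned char matters: a plain (possibly signed) char holding a
// byte >= 0x80 would be sign-extended before the OR.
constexpr std::uint16_t u16_le(unsigned char lo, unsigned char hi)
{
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Hypothetical holder for the numeric header fields.
struct ParsedHeader {
    std::uint16_t Ny;
    std::uint16_t Nx;
    std::uint8_t  L;
    std::uint16_t distance;
    std::uint8_t  N;
};

// Decode the numeric fields from the first 26 bytes of the file.
inline ParsedHeader parse_header(const std::vector<unsigned char>& buf)
{
    assert(buf.size() >= 26); // assumed header size, see the table in the question
    ParsedHeader h{};
    h.Ny = u16_le(buf[18], buf[19]);
    h.Nx = u16_le(buf[20], buf[21]);
    h.L = buf[22];
    h.distance = u16_le(buf[23], buf[24]);
    h.N = buf[25];
    return h;
}

// Decode quantisation levels (2 bytes each, LSB first) from raw bytes.
inline std::vector<std::int16_t> parse_levels(const std::vector<unsigned char>& raw)
{
    std::vector<std::int16_t> levels(raw.size() / 2);
    for (std::size_t i = 0; i < levels.size(); ++i)
        levels[i] = static_cast<std::int16_t>(u16_le(raw[2 * i], raw[2 * i + 1]));
    return levels;
}
```

After `parse_header`, one would read `N * 2` bytes and pass them to `parse_levels`, then read `L` blocks of `Nx * Ny` bytes straight into `unsigned char` vectors, since single bytes need no byte-order conversion.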

Solution 2 (x86 only)

x86 is little-endian and not sensitive to data alignment, so we can read directly into a (packed) struct:

#include <cstdint>
#include <fstream>
#include <vector>

#pragma pack(push, 1) // to prevent padding inside the struct
struct Header {
    uint8_t id;
    char date[17];
    uint16_t Ny;
    uint16_t Nx; 
    uint8_t L;
    uint16_t distance;
    uint8_t N;
};
#pragma pack(pop)

struct Data {
    Header hdr;
    std::vector<int16_t> quant_levels;
    std::vector<std::vector<unsigned char>> pixels; // [level][y * Nx + x]
};

Data readFile(const char* filename)
{
    Data data{};
    // open the file:
    std::fstream fh(filename, std::fstream::in | std::fstream::binary);

    // read the header
    fh.read((char*)&data.hdr, sizeof(data.hdr));

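    // the header fields live in data.hdr, not directly in data
    data.quant_levels.resize(data.hdr.N);
    fh.read((char*)data.quant_levels.data(), data.hdr.N * 2);

    // one flat Nx*Ny byte block per 3D-level
    data.pixels.resize(data.hdr.L);
    for (auto& level : data.pixels) {
        level.resize(data.hdr.Nx * data.hdr.Ny);
        fh.read((char*)level.data(), data.hdr.Nx * data.hdr.Ny);
    }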
    }

    return data;
}
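
Because this approach depends on the exact memory layout of the packed struct, it may be worth letting the compiler verify that layout against the file format. The following is a sketch, assuming a 17-byte date field and a 26-byte header; it repeats the `Header` definition so that it compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1) // no padding between members
struct Header {
    std::uint8_t id;
    char date[17];
    std::uint16_t Ny;
    std::uint16_t Nx;
    std::uint8_t L;
    std::uint16_t distance;
    std::uint8_t N;
};
#pragma pack(pop)

// Compile-time checks: each member must sit at the byte offset
// given in the file format table (assuming the date occupies bytes 1-17).
static_assert(sizeof(Header) == 26, "Header must be 26 bytes");
static_assert(offsetof(Header, date) == 1, "date starts at byte 1");
static_assert(offsetof(Header, Ny) == 18, "Ny starts at byte 18");
static_assert(offsetof(Header, Nx) == 20, "Nx starts at byte 20");
static_assert(offsetof(Header, L) == 22, "L is byte 22");
static_assert(offsetof(Header, distance) == 23, "distance starts at byte 23");
static_assert(offsetof(Header, N) == 25, "N is byte 25");
```

If the packing pragma is missing or unsupported, these checks fail at compile time instead of the program silently reading the wrong bytes. They do not check the byte order of the host, which still has to be little-endian for Solution 2.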

Note that there appears to be a typo in your spec: bytes 1-7 should probably be 1-17. The next value starts at byte 18, and the example 31-Mar-1998 11:55 is 17 characters long, which does not fit in 7 bytes.
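
With the portable approach, the date/time text can be copied out of the header bytes directly. This is a small sketch, assuming the field really occupies bytes 1-17 and is not null-terminated; `extract_date` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Copy the 17 date/time bytes at offsets 1..17 into a std::string.
inline std::string extract_date(const std::vector<unsigned char>& header)
{
    const std::size_t kDateOffset = 1;  // assumed from the format table
    const std::size_t kDateLength = 17; // assumed corrected field length
    assert(header.size() >= kDateOffset + kDateLength);
    return std::string(header.begin() + kDateOffset,
                       header.begin() + kDateOffset + kDateLength);
}
```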

rustyx