
I'm trying to write and then read back a file in a simple format: three lines, holding an unsigned long long (the element count) and two float[] arrays.

I'm not sure why but I can't get the expected values.

vector<float> flowX{};
vector<float> flowY{};

void Save(string filename)
{
    ofstream fileTarget(filename);
    fileTarget << flowX.size() << endl;
    for (const auto& e : flowX) fileTarget << e;
    fileTarget << endl;
    for (const auto& e : flowY) fileTarget << e;
    fileTarget << endl;
    fileTarget.close();
}

size_t Read(string filename)
{
    ifstream fileTarget(filename);
    size_t size{};
    string firstLine;
    string xLine;
    string yLine;
    if (fileTarget.good())
    {
        getline(fileTarget, firstLine);
        size = stoll(firstLine);
        getline(fileTarget, xLine);
        getline(fileTarget, yLine);
    }
    fileTarget.close();
    auto* xArray = (float*)&xLine[0];
    auto* yArray = (float*)yLine.c_str(); //same as above
    vector<float> vX(xArray, xArray + size);
    vector<float> vY(yArray, yArray + size);
    flowX = vX;
    flowY = vY;
    return size;
}

I read (twice)

1.65012e-07 1.04306e-08 0.000175648 4.05171e-11 4.14959e-08 0.000175648 1.04297e-08 1.03158e-08

instead of the expected

3.141f, 2324.89f, 122.89f, 233.89f, 7.3289f, 128.829f, 29.8989f, 11781.8789f

In Read(), since string stores its data in contiguous memory, the reinterpretation as a float[] should be fine, shouldn't it? (https://stackoverflow.com/a/1986974/1447389) The endianness should not have changed. I'm using the VS2019 toolset 1.42 x64 with C++17.

What am I doing wrong?


EDIT

Checking if I'm writing in binary (with and without ios::binary I get the same file) with hexdump:

$ hd RawSave.sflow

00000000  38 0d 0a 35 34 2e 38 39  32 2e 38 39 36 2e 38 39  |8..54.892.896.89|
00000010  37 2e 38 39 31 38 2e 38  39 32 39 2e 38 39 31 31  |7.8918.8929.8911|
00000020  31 2e 38 39 0d 0a 35 34  2e 38 39 32 2e 38 39 36  |1.89..54.892.896|
00000030  2e 38 39 37 2e 38 39 31  38 2e 38 39 32 39 2e 38  |.897.8918.8929.8|
00000040  39 31 31 31 2e 38 39                              |9111.89|
00000047
Soleil
  • Is your file binary data? It doesn't seem so since you're using getline to read it and you haven't opened the file in binary mode. Assuming it is text then this won't work, you need to convert the text to a float as you are doing with the first line. – Retired Ninja Dec 15 '20 at 05:58
  • @RetiredNinja well, yes I write and read in binary, I have the same result (same file) with or without `ios::binary` – Soleil Dec 15 '20 at 06:02
  • I believe you are mistaken about your file being binary if you are reading it with `getline`. Add a few lines of it to your question. – Retired Ninja Dec 15 '20 at 06:04
  • @RetiredNinja You're right, but not for the right reasons; I was not writing in binary, because of `fileTarget << e;`, which should be `fileTarget.write(reinterpret_cast<const char*>(&e), sizeof(float));` – Soleil Dec 15 '20 at 06:18
  • Just less/cat the file, it is text. The first char hd'ed in the file is 0x38, '8', and the second-third, "0d 0a" are "\r\n". – ChuckCottrill Dec 15 '20 at 17:03

3 Answers


These two lines look right:

getline(fileTarget, firstLine);
size = stoll(firstLine);

But then you switch to simply casting the string's bytes to float, which doesn't work. Try one of the strtof family of functions instead.

The line:

auto* xArray = (float*)&xLine[0];

This makes C++ take the address of the string's bytes and treat it as a pointer to float; no conversion of the data occurs. That is inconsistent with how the file was read: a string filled by getline holds text characters, whereas a float is a binary representation of a floating-point number. So either stop using getline, read the file as raw bytes (unsigned characters) and keep the (float*) cast you currently have, or keep reading text and use a function that converts the ASCII characters to a float.
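The second option (converting text to float) could be sketched as follows. This is only an illustration: `ParseFloats` is a hypothetical helper, not part of the question's code, and it assumes the numbers on a line are separated by whitespace, which the file in the question does not do.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: convert a line of whitespace-separated decimal
// numbers into floats using std::strtof.
std::vector<float> ParseFloats(const std::string& line)
{
    std::vector<float> values;
    const char* cursor = line.c_str();
    char* end = nullptr;
    for (;;)
    {
        const float value = std::strtof(cursor, &end);
        if (end == cursor) break;   // nothing more could be converted
        values.push_back(value);
        cursor = end;               // continue after the parsed number
    }
    return values;
}
```

Calling `ParseFloats(xLine)` would then replace the `(float*)` cast in Read(). Note that this only recovers the original values if the writer put a separator between them; digits written back to back cannot be split reliably.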

Using getline() to read binary data will result in the 'line' ending the first time a line-feed byte (decimal 10) is encountered, which makes no sense for binary data, because that byte can occur anywhere in it.

When you dumped the file showing:

00000000  38 0d 0a 35 34 2e 38 39  32 2e 38 39 36 2e 38 39  |8..54.892.896.89|
00000010  37 2e 38 39 31 38 2e 38  39 32 39 2e 38 39 31 31  |7.8918.8929.8911|
00000020  31 2e 38 39 0d 0a 35 34  2e 38 39 32 2e 38 39 36  |1.89..54.892.896|
00000030  2e 38 39 37 2e 38 39 31  38 2e 38 39 32 39 2e 38  |.897.8918.8929.8|
00000040  39 31 31 31 2e 38 39                              |9111.89|
00000047

The file you are reading is not binary. It is text: the character "8", a line break, then the digits "54.892.896.897.8918.8929.89111.89", all visible in the right-hand column of your file dump.

Jeff Spencer
  • I did not. I read it as a string, use it as a uchar[], and reinterpret it as a float[] (concatenating 4 uchar into 1 float). – Soleil Dec 15 '20 at 06:04
  • Please check out my answer; it demonstrates that this was not the problem and that it is fine to use `string` as a `byte[]` container. It's actually fine to read binary data as a string, since I know that the array ends with the byte `endl`, i.e. a newline. That's my file format. – Soleil Dec 15 '20 at 06:23
  • @Soleil-MathieuPrévot `finishes with the byte endl` Except the same byte `0x0A` can occur inside the binary representation of a `float` value. – dxiv Dec 15 '20 at 06:30
  • @dxiv there is no reason it can't occur, you're right, damn. – Soleil Dec 15 '20 at 06:33

The correct way to write the values was to use fileTarget.write() (which adds the raw bytes) instead of fileTarget << e (which adds formatted text):

//for (const auto& e : flowX) fileTarget << e; // adding text
for (const auto& e : flowX)
    fileTarget.write(reinterpret_cast<const char*>(&e), sizeof(float));

Checking with hd:

00000000  38 0a 25 06 49 40 3d 4e  11 45 ae c7 f5 42 d7 e3  |8.%.I@=N.E...B..|
00000010  69 43 59 86 ea 40 39 d4  00 43 f2 30 ef 41 84 17  |iCY..@9..C.0.A..|
00000020  38 46 0a 25 06 49 40 3d  4e 11 45 ae c7 f5 42 d7  |8F.%.I@=N.E...B.|
00000030  e3 69 43 59 86 ea 40 39  d4 00 43 f2 30 ef 41 84  |.iCY..@9..C.0.A.|
00000040  17 38 46 0a                                       |.8F.|
00000044

As mentioned by dxiv, the problem is that the bytes of a float can contain the newline byte 0x0A; therefore a getline-delimited string must not be used to hold a raw float[].
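A small check makes dxiv's point concrete. This is a sketch under the assumption that float is a 32-bit IEEE-754 value (as it is on the VS2019 x64 toolchain mentioned in the question); `ContainsLineFeedByte` is a hypothetical helper name. Under that assumption, 8.625f has the bit pattern 0x410A0000, so one of its four bytes is 0x0A regardless of byte order.

```cpp
#include <algorithm>
#include <array>
#include <cstring>

// Hypothetical helper: true if the in-memory bytes of `value` include
// 0x0A, the byte that getline treats as the end of a line.
bool ContainsLineFeedByte(float value)
{
    std::array<unsigned char, sizeof(float)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(float));  // copy the raw bytes
    return std::find(bytes.begin(), bytes.end(), 0x0A) != bytes.end();
}
```

If such a value were stored in flowX, getline would cut the "line" short in the middle of that float, and every value after it would be misread.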

Subsequently, the proper way to write the two float[] is:

ofstream fileTarget(filename, ios::binary);
size_t size{ flowX.size() };
// ostream::write takes a const char*, so cast to that rather than to byte*
fileTarget.write(reinterpret_cast<const char*>(&size), sizeof(size_t));
auto sizeBytes = size * sizeof(float);
fileTarget.write(reinterpret_cast<const char*>(flowX.data()), sizeBytes);
fileTarget.write(reinterpret_cast<const char*>(flowY.data()), sizeBytes);

And the proper way to read them is:

ifstream fileTarget(filename, ios::binary);
size_t size{};
vector<float> vX{};
vector<float> vY{};
if (fileTarget.good())
{
    // istream::read takes a char*, so cast to that rather than to byte*
    fileTarget.read(reinterpret_cast<char*>(&size), sizeof(size_t));
    auto sizeBytes = size * sizeof(float);
    vX.resize(size);
    vY.resize(size);
    fileTarget.read(reinterpret_cast<char*>(vX.data()), sizeBytes);
    fileTarget.read(reinterpret_cast<char*>(vY.data()), sizeBytes);
}

As suggested by Retired Ninja, this reads and writes only in binary instead of a mixed mode (text + binary), which is simpler and faster.
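Put together as two functions, the binary-only approach might look like the sketch below. `SaveFlow` and `LoadFlow` are hypothetical names, and storing the count as a fixed-width `std::uint64_t` (rather than `size_t`) is an assumption made here so the header has the same size on every platform. The file is still only portable between machines with the same endianness and float format.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: write a 64-bit element count, then both arrays as raw bytes.
bool SaveFlow(const std::string& filename,
              const std::vector<float>& xs, const std::vector<float>& ys)
{
    if (xs.size() != ys.size()) return false;   // the format stores one shared count
    std::ofstream out(filename, std::ios::binary);
    const std::uint64_t count = xs.size();
    const auto byteCount = static_cast<std::streamsize>(count * sizeof(float));
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    out.write(reinterpret_cast<const char*>(xs.data()), byteCount);
    out.write(reinterpret_cast<const char*>(ys.data()), byteCount);
    return out.good();
}

// Hypothetical helper: read the count, size both vectors, then read their bytes.
bool LoadFlow(const std::string& filename,
              std::vector<float>& xs, std::vector<float>& ys)
{
    std::ifstream in(filename, std::ios::binary);
    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof(count))) return false;
    // No sanity check on count here; a corrupted file could request a huge allocation.
    xs.resize(static_cast<std::size_t>(count));
    ys.resize(static_cast<std::size_t>(count));
    const auto byteCount = static_cast<std::streamsize>(count * sizeof(float));
    in.read(reinterpret_cast<char*>(xs.data()), byteCount);
    in.read(reinterpret_cast<char*>(ys.data()), byteCount);
    return !in.fail();                          // fails if the file was too short
}
```

Because no byte is treated as a delimiter, a float whose representation contains 0x0A round-trips like any other value.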

Soleil
  • If you decide to use a binary file then you don't need the newlines, and you can't read it with `getline`. – dxiv Dec 15 '20 at 06:28
  • @dxiv I just demonstrated that I was able to use getline to read my binary data; I needed the `endl` to signal the end of the float[], twice. Which I think leads to a much simpler binary reader (with the string/one line per float[] trick). – Soleil Dec 15 '20 at 06:31
  • All that demonstrates is that it may happen to work for a few chosen values. See my other comment [here](https://stackoverflow.com/questions/65300685/read-raw-as-string-then-reinterpret-as-float#comment115445126_65300771). – dxiv Dec 15 '20 at 06:32
  • You should use `data()` instead of `front()` and `resize()` instead of `assign()`. Kinda silly to keep using `getline` to read the size and then convert it. Use `read` directly into the `size` variable and the code is both more efficient and clearer. – Retired Ninja Dec 16 '20 at 05:54

You write the file in text. See your function Save(filename),

ofstream fileTarget(filename);
fileTarget << flowX.size() << endl;

Size is 8, followed by "\r\n"

Then you append a bunch of numbers with no separator, followed by a newline,

for (const auto& e : flowX) fileTarget << e;
fileTarget << endl;

Since you don't use a separator, how do you find the end of one number and the start of the next?

Append a separator, "|",

for (const auto& e : flowX) fileTarget << e << "|";
fileTarget << endl;

for (const auto& e : flowY) fileTarget << e << "|";
fileTarget << endl;

When reading the lines back, you need some way to scan each line, find the separators, and convert each number (left as an exercise for the reader).
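One possible way to do that scan is sketched below. `ParseSeparated` is a hypothetical helper; it assumes every field between separators is a valid decimal number, since `std::stof` throws `std::invalid_argument` otherwise.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `line` on `separator` and convert each field to float.
std::vector<float> ParseSeparated(const std::string& line, char separator = '|')
{
    std::vector<float> values;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, separator))
    {
        // Skip empty fields and a stray carriage return left over from "\r\n".
        if (field.empty() || field == "\r") continue;
        values.push_back(std::stof(field));
    }
    return values;
}
```

With the "|"-separated format above, `ParseSeparated(xLine)` would yield the contents of flowX, up to the precision that `operator<<` wrote.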

ChuckCottrill
  • flowX and flowY have the same size, which is the `ull` written at the beginning. Since each is an array of float, each number takes 4 bytes; a separator doubles the size of the file (!). – Soleil Dec 15 '20 at 12:29