Short version first; the long version follows.
Short:
I have a 2D matrix of float32 values. I want to write it to a .txt file as raw bytes while keeping the structure, which means adding a newline character at the end of each row. Some numbers, such as 683.61, contain the byte \n once converted to bytes, which produces an undesired newline and breaks reading the file line by line. How can I do this?
Long:
I am writing a program that works with huge arrays of data (2D matrices). For that purpose, I need the array stored on disk rather than in RAM, as the data might be too big for the computer's memory. I created my own file type, which the program reads. It has a header containing important parameters as bytes, followed by the matrix as byte arrays.
As I write the data to the file one float32 at a time, I add a newline (\n) character at the end of each row of the matrix, so that the structure is kept.
Writing goes well, but reading causes issues because some numbers, once converted to bytes, include \n.
As an example:
struct.pack('f',683.61)
will yield
b'\n\xe7*D'
This cuts my matrix rows, and sometimes the cut falls in the middle of a number's bytes, which makes the byte array size wrong.
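To make the problem concrete, here is a minimal sketch that reproduces it in memory. It assumes little-endian float32 packing, and the helper name pack_row is hypothetical, used only for illustration; it is not part of my actual program.

```python
import struct

def pack_row(values):
    """Pack one matrix row as little-endian float32 values plus a trailing newline."""
    return b"".join(struct.pack("<f", v) for v in values) + b"\n"

# Two rows of three float32 values each; 683.61 packs to b'\n\xe7*D'.
data = pack_row([1.0, 683.61, 2.0]) + pack_row([3.0, 4.0, 5.0])

# Reading the data back "as lines" splits on every 0x0A byte,
# including the one that belongs to 683.61.
lines = data.split(b"\n")
lengths = [len(line) for line in lines]
print(lengths)  # [4, 7, 12, 0]
```

Each intact row should be 12 bytes long (3 values of 4 bytes). The first row is instead cut into a 4-byte piece and a 7-byte piece, and the 7-byte piece is not even a multiple of 4.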
From this question: Python handling newline and tab characters when writing to file
I found out that a str can be encoded with 'unicode_escape' to double the backslashes and avoid confusion when reading.
Some_string.encode('unicode_escape')
However, this method only works on strings, not on bytes or bytearrays (I tried it). This means I can't use it when I convert a float32 directly to bytes and write it to a file.
I have also tried converting the float to bytes, decoding the bytes as a str, and re-encoding it like so:
struct.pack('f',683.61).decode('utf-8').encode('unicode_escape')
but decode fails on these bytes, because they are not valid UTF-8.
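The failed attempt can be reproduced with the small sketch below. Little-endian packing is assumed and the variable names are only illustrative.

```python
import struct

raw = struct.pack("<f", 683.61)  # b'\n\xe7*D'

try:
    raw.decode("utf-8")
    error_name = None
except UnicodeDecodeError as exc:
    # 0xE7 announces a multi-byte UTF-8 sequence, but the next byte
    # (0x2A, '*') is not a valid continuation byte.
    error_name = type(exc).__name__
    print(exc)

print(error_name)  # UnicodeDecodeError
```

So the packed float is arbitrary binary data, and only some byte combinations happen to form valid UTF-8 text.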
I have also tried converting the bytes to a string directly and then encoding it, like so:
str(struct.pack('f',683.61)).encode('unicode_escape')
This yields a mess, from which the bytes can be extracted like this:
bytes("b'\\n\\xe7*D'"[2:-1],'utf-8')
And finally, when I actually read the byte array, I obtain two different results depending on whether unicode_escape has been used or not:
numpy.frombuffer(b'\n\xe7*D', dtype=numpy.float32)
yields: array([683.61], dtype=float32)
numpy.frombuffer(b'\\n\\xe7*D', dtype=numpy.float32)
yields: array([1.7883495e+34, 6.8086554e+02], dtype=float32)
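As far as I can tell, the difference comes from the length of the data: the escaped version stores each backslash and letter as its own byte, so it is 8 bytes long and is read as two float32 values. A small check, assuming little-endian packing and using struct instead of numpy only to keep the sketch dependency-free:

```python
import struct

raw = struct.pack("<f", 683.61)            # the 4 original bytes
escaped = str(raw)[2:-1].encode("utf-8")   # the escaped text, one byte per character

raw_count = len(raw) // 4          # number of float32 values in the raw bytes
escaped_count = len(escaped) // 4  # number of float32 values in the escaped bytes

# Interpret the escaped bytes as float32 values, as frombuffer does.
values = struct.unpack("<%df" % escaped_count, escaped)
print(len(raw), len(escaped), raw_count, escaped_count)  # 4 8 1 2
```

The second of the two values is the 680.86554 seen above, built from the bytes of the characters e, 7, * and D.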
I am expecting the top result, not the bottom one. So I am back to square one.
--> How can I encode my matrix of floats as bytes, on multiple lines, without being affected by newline characters inside the packed numbers?
F.Y.I. I decode the bytes with numpy because this is the working method I found, but it might not be the best way. I am just starting to play around with bytes.
Thank you for your help. If there is any issue with my question, please let me know and I will gladly rewrite it.