I'm writing a python script for the data acquisition phase of my project and so far I've been storing data in .csv files. While I was reading data from a particular .csv file, I got the error:
syntaxError: EOL while scanning string literal
I took a look at the specific row in the file and and the data in the specific cell were truncated. I am using pandas to store dicts to csv and it never threw an error. I guess .csv will save itself no matter what, even if that means it will delete data without any warning.
I thought of changing to .xls. When the same row was being stored, an error came up saying (something along the lines of):
Max character length reached. Max character length per cell was ~32k.
Then I thought that it may just be an excel/libreoffice calc issue (I tried both) and they can't visualize the data in cell but they are actually there. So I tried printing the specific cell; data were indeed truncated. The specific cell contains a dict, whose values are float, int, boolean or string. However, all of them have been converted to strings.
My question is, is there a way to fix it without changing the file format?
In the case that I have to change the file format, what would be an appropriate choice to store very large files? I am thinking about hdf5.
In case you need more info, do let me know. Thank you!