
I have a collection of mainly numerical data files that are the result of running a physics simulation (or several). I can convert the files into pandas dataframes. It is natural to organize the dataframe objects in lists, lists of lists, etc. For example:

allData = [df1, [df11, df12], df2, [df21, df22]]

I want to save this data to files (to be sent elsewhere). I know the whole thing can be dumped into one file with, e.g., pickle, but I don't want this because some files can be large and I want to be able to load them selectively. So each dataframe should be stored as a separate file.

But I also want to store how the objects are organized into lists, for example in another file, so that when the files are read somewhere else, Python will know how the data files are connected.

Possibly I could solve this by inventing some system of my own that writes the filenames and their structure into a text file (a rough sketch of what I mean is below). But is there a proper/cleaner way to do it?
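For concreteness, here is a rough sketch of the sort of homemade scheme I mean. The file names, the parquet format (which needs pyarrow or fastparquet installed; per-file pickle would work the same way), and the JSON manifest layout are all just choices I made up for illustration:

import itertools
import json

import pandas as pd

def save_tree(node, prefix, counter):
    """Recursively mirror the nested-list structure, writing each
    dataframe to its own file and returning the file name in its place."""
    if isinstance(node, list):
        return [save_tree(child, prefix, counter) for child in node]
    fname = f"{prefix}_{next(counter)}.parquet"
    node.to_parquet(fname)  # each dataframe gets its own file
    return fname

def load_tree(node):
    """Rebuild the nested structure, replacing file names with dataframes."""
    if isinstance(node, list):
        return [load_tree(child) for child in node]
    return pd.read_parquet(node)

# Saving: write the dataframes plus a manifest describing the nesting.
manifest = save_tree(allData, "sim", itertools.count())
with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)

# Loading elsewhere: the manifest tells Python how the files are connected.
with open("manifest.json") as f:
    manifest = json.load(f)
allData = load_tree(manifest)

Since the manifest mirrors the nesting but holds only file names, I could also read just one branch of it and load only those files.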

Erik
  • Yes, one proper/cleaner way would be pickle, which you dismissed. – mkrieger1 Jun 25 '21 at 07:21
  • Does this answer your question? [Easiest way to persist a data structure to a file in python?](https://stackoverflow.com/questions/1047318/easiest-way-to-persist-a-data-structure-to-a-file-in-python) – mkrieger1 Jun 25 '21 at 07:22
  • Sometimes you'll want to read everything. Other times you'll only be interested in certain parts of the data. This could be upwards of 100 GB total. Do you mean, @mkrieger1, that speed and efficiency will be the same for one big file as for several smaller ones, even if you only need certain parts of the data? – Erik Jun 25 '21 at 07:26
  • To answer your question, @mkrieger1: no, not exactly; as far as I can see, that just uses one file. I guess the answer to the question in my previous comment decides whether I need a solution to my original question, or whether I can just use one pickle file. – Erik Jun 25 '21 at 07:32
  • Have you considered using a database, for example SQLite? – mkrieger1 Jun 25 '21 at 07:46
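For reference, a minimal sketch of the SQLite idea from the last comment: one database file can hold each dataframe as its own table, and tables can be read back selectively. The file and table names are only placeholders, and the nesting would still have to be encoded somewhere (e.g., in the table names or an extra index table):

import sqlite3

import pandas as pd

# One database file holds every dataframe as a separate table.
con = sqlite3.connect("simulations.db")
df1.to_sql("df1", con, index=False)
df11.to_sql("df11", con, index=False)  # ...and so on for the rest
con.close()

# Elsewhere, load only the table that is needed.
con = sqlite3.connect("simulations.db")
df11 = pd.read_sql("SELECT * FROM df11", con)
con.close()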

0 Answers