I am generating row vectors in a loop. For example:

import random
for x in range(100):
    print([random.randint(0,10), random.randint(0,10), random.randint(0,10)])

How do I append these rows, one at a time (or in a predefined chunk size) to an (initially empty) array that is saved to file (for example using HDF5 or similar)? I know the number of columns the array will need but do not know in advance how many rows the final array will have.

user20862

1 Answer

import random

vec_array = []

for x in range(100):
    row = [random.randint(0,10), random.randint(0,10), random.randint(0,10)]
    print(row)
    vec_array.append(row)

print(vec_array)

I'm not sure what you mean by "array that is saved". Are you exporting this array somewhere? To save it as JSON, you could use:

import json

# Write the whole list to disk in one call ('w' is enough; the file is never read back here)
with open('output.json', 'w') as f:
    json.dump({'vec_array': vec_array}, f)

To load that data again, you would simply run:

import json

with open('output.json') as f:
    vec_array = json.load(f)['vec_array']

Further reading:
https://docs.python.org/3/library/json.html

For much larger datasets, a SQL database would be suitable:
https://docs.python.org/3/library/sqlite3.html
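
As a rough sketch of the SQLite route, the loop below buffers rows and inserts them in fixed-size chunks, so the full array never has to sit in memory. The file name vectors.db, the table name vectors, the column names a, b, c and the chunk size of 25 are all placeholders I made up for illustration; the temporary directory is only there so the example starts from an empty file each time.

```python
import os
import random
import sqlite3
import tempfile

CHUNK_ROWS = 25  # hypothetical chunk size: rows buffered before each write


def append_rows(conn, rows):
    """Insert a chunk of 3-column rows and commit so they are on disk."""
    conn.executemany("INSERT INTO vectors (a, b, c) VALUES (?, ?, ?)", rows)
    conn.commit()


# Placeholder location; in practice this would be a fixed path of your choosing.
db_path = os.path.join(tempfile.mkdtemp(), "vectors.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS vectors (a INTEGER, b INTEGER, c INTEGER)")

chunk = []
for x in range(100):
    chunk.append([random.randint(0, 10), random.randint(0, 10), random.randint(0, 10)])
    if len(chunk) == CHUNK_ROWS:
        append_rows(conn, chunk)
        chunk = []
if chunk:
    # Flush whatever is left when the loop ends mid-chunk.
    append_rows(conn, chunk)

row_count = conn.execute("SELECT COUNT(*) FROM vectors").fetchone()[0]
conn.close()
```

The number of rows does not need to be known in advance; the table simply grows with each insert.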

If you are certain you wish to use HDF5, you will have to create the dataset as resizable (the maxshape argument in h5py) and resize it whenever the rows you write would go past its current size.
https://docs.h5py.org/en/stable/high/dataset.html#reading-writing-data
https://docs.h5py.org/en/stable/high/dataset.html#resizable-datasets
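
Here is a minimal sketch of that resizable-dataset approach, assuming h5py and NumPy are installed. The file name vectors.h5, the dataset name vectors, the int64 dtype and the chunk size of 25 rows are my own placeholder choices, not anything required by HDF5; the temporary directory just keeps the example self-contained.

```python
import os
import random
import tempfile

import h5py
import numpy as np

N_COLS = 3       # number of columns is known up front
CHUNK_ROWS = 25  # hypothetical chunk size: rows buffered before each write


def append_rows(dset, rows):
    """Grow the dataset along axis 0 and write the new rows at the end."""
    rows = np.asarray(rows, dtype=dset.dtype)
    old_len = dset.shape[0]
    dset.resize(old_len + rows.shape[0], axis=0)
    dset[old_len:] = rows


# Placeholder location; in practice this would be a fixed path of your choosing.
path = os.path.join(tempfile.mkdtemp(), "vectors.h5")

with h5py.File(path, "w") as f:
    # Start with zero rows; maxshape=(None, N_COLS) leaves the row count unbounded.
    dset = f.create_dataset(
        "vectors",
        shape=(0, N_COLS),
        maxshape=(None, N_COLS),
        dtype="i8",
        chunks=(CHUNK_ROWS, N_COLS),
    )
    buffer = []
    for x in range(100):
        buffer.append([random.randint(0, 10) for _ in range(N_COLS)])
        if len(buffer) == CHUNK_ROWS:
            append_rows(dset, buffer)
            buffer = []
    if buffer:
        # Flush whatever is left when the loop ends mid-chunk.
        append_rows(dset, buffer)

# Reopen read-only to confirm what ended up on disk.
with h5py.File(path, "r") as f:
    final_shape = f["vectors"].shape
```

Writing in chunks rather than one row at a time keeps the number of resize calls down; a chunk size of 1 would also work, just more slowly.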

geofurb
  • My intention is to save the vectors to a file on disk. I have updated the question to make that clearer – user20862 May 28 '21 at 10:45
  • Do you have a specific format you intend to use? JSON, CSV, pickle? I see you added the hdf5 tag, but that seems like overkill for this task. – geofurb May 28 '21 at 10:46
  • The array could get very large, hence the need to save as I go along rather than hold it all in memory. I was told HDF5 was good for that, but I am open to whatever would do the job. – user20862 May 28 '21 at 10:51
  • Ahh! Those are definitely details you would want to put in the original question. – geofurb May 28 '21 at 10:55
  • Try this: https://stackoverflow.com/questions/25655588/incremental-writes-to-hdf5-with-h5py?rq=1 – geofurb May 28 '21 at 10:56