I currently run a simulation several times and want to save the results so that they can be used for visualizations.
The simulation is run 100 times, and each run generates about 1 million data points (i.e. one value for each of 1 million episodes), which I now want to store efficiently. The goal is to compute, for each episode, the average of its value across all 100 simulations.
My main file looks like this:
import h5py

# Defining the test simulation environment
def test_simulation():
    # Local name differs from the class so the class is not shadowed
    env = environment(
        periods = 1000000,
        parameter_x = ...,
        parameter_y = ...,
    )
    # Defining the simulation
    env.simulation()
    # Save simulation data
    hf = h5py.File('runs/simulation_runs.h5', 'a')
    hf.create_dataset('data', data=env.value_history, compression='gzip', chunks=True)
    hf.close()

# Run the simulation 100 times
for i in range(100):
    print(f'--- Iteration {i} ---')
    test_simulation()
The value_history is generated within simulation(), i.e. the values are continuously appended to an initially empty list:
def simulation(self):
    for episode in range(self.periods):
        value = doSomething()
        self.value_history.append(value)
Now I get the following error message when going to the next simulation:
ValueError: Unable to create dataset (name already exists)
I am aware that the current code reopens the same file on every run and tries to create the dataset 'data' again, which fails because that dataset already exists after the first simulation. What I am looking for is a way to reopen the file created in the first simulation, append the data from the next simulation, and save it again.
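A minimal sketch of that reopen-and-append idea, assuming every run produces the same number of values and that the values are numeric. The helper names `append_run` and `mean_over_runs` are hypothetical and not part of h5py. The idea is to create the dataset once with an unlimited first axis (`maxshape=(None, n)`), grow it by one row per run with `Dataset.resize`, and later average along the run axis.

```python
import h5py
import numpy as np


def append_run(path, values, name="data"):
    """Append one run as a new row of a resizable 2-D dataset (hypothetical helper)."""
    values = np.asarray(values, dtype="f8")  # assumption: values are numeric
    with h5py.File(path, "a") as hf:
        if name not in hf:
            # First run: create an empty dataset that can grow along axis 0.
            hf.create_dataset(
                name,
                shape=(0, values.size),
                maxshape=(None, values.size),
                dtype=values.dtype,
                chunks=True,
                compression="gzip",
            )
        dset = hf[name]
        n_runs = dset.shape[0]
        # Add one row, then write the new run into it.
        dset.resize(n_runs + 1, axis=0)
        dset[n_runs, :] = values


def mean_over_runs(path, name="data"):
    """Return the per-episode mean across all stored runs (hypothetical helper)."""
    with h5py.File(path, "r") as hf:
        # Loads the whole dataset into memory before averaging.
        return hf[name][...].mean(axis=0)
```

In the loop above, `test_simulation()` could then call `append_run('runs/simulation_runs.h5', env.value_history)` in place of the `create_dataset` call. Note that `mean_over_runs` reads all runs at once (roughly 800 MB for 100 runs of 1 million float64 values), so reading and averaging in column slices may be preferable if memory is tight.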