I need to simulate a hyperexponential distribution. I wrote the function below, which draws the samples and stores the counts in a dictionary used as a histogram. It then writes the histogram to a CSV file so I can view it in a spreadsheet (such as Excel), and it also returns the histogram.

    import numpy as np

    def simulate_hyper_exponential_distribution(p=0.617066, lambda1=0.051, lambda2=0.052, iterations=10 ** 5):
        n = 0  # type: int
        histogram = {}  # type: dict[float, int]
        for n in xrange(0, iterations):
            lambda_used = lambda1
            if np.random.uniform() >= p:
                lambda_used = lambda2
            random = float(round(np.random.exponential(1. / lambda_used), 1))
            if random not in histogram:
                histogram[random] = 1
            else:
                histogram[random] += 1
        if sum(histogram.values()) != iterations:
            print "Error!"
            return
        file = open('C:\\Users\\SteveB\\Desktop\\test_histogram.csv', 'w')
        max_histogram_key = max(histogram.keys()) + 0.1  # type: int
        # I think the error is in this for
        for current in np.arange(0, max_histogram_key, 0.1):
            if float(current) in histogram:  # I think this is the line that fails
                file.write(str(current) + ',' + str(histogram[current]) + '\n')
            else:
                file.write(str(current) + ',0\n')
        file.close()
        print 'Finished!'
        return histogram

I run it with this line:

    histogram = simulate_hyper_exponential_distribution(0.617066, 0.051, 0.052)

My problem is that the resulting CSV file shows a count of 0 for certain values, and I know that those counts are not 0. More interestingly, the same values are wrong on every execution (namely 0.3, 0.6, 0.7, 1.2, 1.4, 1.7, 1.9, 2.3, 2.4). If I type histogram[0.3] (or any of the values listed above), I get a nonzero count.
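The symptom can probably be reproduced without the simulation. This is only a minimal sketch: the small `histogram` here is a made-up stand-in for the real one, and it assumes the only things that matter are that the keys come from `round(value, 1)` while the lookups use the values produced by `np.arange`, as in my function.

```python
import numpy as np

# Made-up stand-in for the real histogram: keys 0.0 .. 0.7, built with
# round(..., 1) the same way the simulation builds its keys.
histogram = {round(i / 10.0, 1): 1 for i in range(8)}

# Same kind of loop variable as in the CSV-writing loop.
grid = np.arange(0, 0.75, 0.1)

# Collect every grid value that is not found as a key, using repr() so
# that all the digits of each float are visible.
missing = [float(x) for x in grid if float(x) not in histogram]
print([repr(x) for x in missing])
```

If the printed list is not empty, the loop values and the dictionary keys are not exactly the same floats, even when they look the same after rounding.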
As a workaround, I now multiply the key by 10 and store it as an int in the dictionary, then divide it by 10 when writing the value to the file. This approach works, but I don't understand what goes wrong when I use float keys. Thanks for your help.
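For reference, the integer-key workaround looks roughly like the sketch below. The helper names (`to_bin`, `build_histogram`, `histogram_rows`), the `bins_per_unit` parameter and the sample values are hypothetical and only for illustration. It also rounds `value * 10` directly, which is an assumption and may differ slightly from rounding to one decimal first.

```python
def to_bin(value, bins_per_unit=10):
    """Map a sample to an integer bin index (hypothetical helper)."""
    return int(round(value * bins_per_unit))


def build_histogram(samples, bins_per_unit=10):
    """Count samples per integer bin, so no float is ever used as a key."""
    histogram = {}
    for s in samples:
        key = to_bin(s, bins_per_unit)
        histogram[key] = histogram.get(key, 0) + 1
    return histogram


def histogram_rows(histogram, bins_per_unit=10):
    """Return (bin_value, count) rows, including empty bins with count 0."""
    rows = []
    for key in range(0, max(histogram) + 1):
        # Convert back to a float only for output, never for the lookup.
        rows.append((key / float(bins_per_unit), histogram.get(key, 0)))
    return rows


# Illustrative samples, not real simulation output.
samples = [0.31, 0.29, 0.6, 0.72, 1.2]
hist = build_histogram(samples)
rows = histogram_rows(hist)
```

Each row can then be written to the CSV file as before. The point is that the lookup is done with integers from `range`, and the division by 10 happens only when the value is formatted.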