I'm new to Python and I'm trying to bin the data in a NumPy array, but I'm really struggling with it!
My array is a simulation of a simple particle diffusion model, given each particle's probability of walking forward or backward. The model can have an arbitrary number of particle species, and that information is encoded in the key vector, which holds integers ranging from 0 to nSpecies - 1. Each of these numbers appears in a proportion chosen by the user, and the length of the vector (the total number of particles) is chosen by the user as well.
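For example (with made-up numbers), a key vector for two species in a 60/40 mix over 1000 particles could be built like this; the exact values here are just an illustration:

import numpy as np

# Made-up example: 2 species, 60%/40% proportions, 1000 particles total
nSpecies = 2
populationSize = 1000
proportions = [0.6, 0.4]  # chosen by the user
key = np.repeat(np.arange(nSpecies),
                (np.array(proportions) * populationSize).astype(int))
np.random.shuffle(key)  # mix the species along the vector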
def walk(diff, key, progressProbability, recessProbability, nSpecies):
    """
    Returns the updated particle positions after one step, weighted by
    each species' walk probabilities.
    """
    # One uniform draw per particle decides its move this step
    random = np.random.rand(len(key))
    forward = key.astype(float)
    backward = key.astype(float)
    # Map each species id to its forward/backward step probability
    for i in range(nSpecies):
        forward[key == i] = progressProbability[i]
        backward[key == i] = recessProbability[i]
    # Step forward with probability forward[i], backward with probability
    # backward[i]; the boolean arrays add/subtract as 0/1
    diff = np.add(diff, random < forward)
    diff = np.subtract(diff, random > 1 - backward)
    return diff
To add time to the simulation, I run the walk function above many times, so the values in diff afterwards represent how far each particle has traveled.
def probability_diffusion(time, progressProbability, recessProbability,
                          changeProbability, key, nSpecies, nBins):
    populationSize = len(key)
    diff = np.zeros(populationSize, dtype=int)
    # Each loop iteration is one time step of the walk
    for t in range(time):
        diff = walk(diff, key, progressProbability, recessProbability, nSpecies)
    return diff
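For reference, a call would look roughly like this (the probabilities and step count are placeholder values, with key and nSpecies from the example above):

progressProbability = [0.7, 0.3]  # placeholder values
recessProbability = [0.2, 0.5]    # placeholder values

# changeProbability and nBins aren't used inside the function yet
diff = probability_diffusion(time=500,
                             progressProbability=progressProbability,
                             recessProbability=recessProbability,
                             changeProbability=None,
                             key=key,
                             nSpecies=nSpecies,
                             nBins=381)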
My goal is to turn this diff array into an array of size 381 without losing the information encoded in it. I thought about doing that by binning the data and averaging within each bin.
I've tried using SciPy's binned_statistic function, but I can't really wrap my head around how it works.
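In case it helps, this is roughly how I understand the call would look, binning the particles by their index into 381 equal-width bins and averaging the displacements in each bin (I'm not sure this is the right way to use it):

from scipy import stats

# numpy is imported as np above
result = stats.binned_statistic(np.arange(len(diff)),  # x: particle index
                                diff,                   # values to average
                                statistic='mean',
                                bins=381)
binned = result.statistic  # array of length 381, one mean per bin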
Any thoughts? Thank you.