I am shuffling an array of, say, 8760 numbers, sorted by value (from low to high), to generate a quasi-stochastic time series. However, I want higher values to have a higher chance of appearing within the first and last quarters of the resulting array, and lower values within the second and third quarters. My questions are:
- Is there a way to manipulate the shuffle function so it works with custom probabilities or do I have to "do it myself" afterwards?
- Is there some other package I do not know yet which can do this?
- Am I possibly blind and overlooking another much easier way to do this?
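import numpy as np
a = np.array([0, 0, 0, 0, 0, ...
1, 1, 1, ...
...
14, 14, 14, 14, 14, 14])
# random.shuffle works in place and returns None, so use permutation to get a shuffled copy
a_shuff = np.random.permutation(a)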
# desired result would be something like
a_shuff = [14, 14, 8, 12, ... 0, 4, 2, 6, 3, ... 13, 14, 9, 11, 12]
It may be important to note that each value has a different number of occurrences within the array.
I hope that describes my problem well enough. I am new to both Python and Stack Overflow, and I'm happy to answer any further questions on this matter.
SOLUTION
By sorting my values as suggested in the answers and assigning increasing probability values to them along the axis (where sum(p) must equal 1), I was able to use NumPy's random.choice function. This may not be an answer to the question I asked, but it achieves the same thing (at least in this specific case):
import numpy as np

# convert list to array (a list was necessary previously)
v_time = np.empty(0)
for r in range(len(temp)):
    v_time = np.append(v_time, temp[r])

# sort values by desired probability - this step may vary depending on the
# desired trend in the shuffled data
arrayA = v_time[0::2]
arrayB = v_time[1::2]
arrayB = np.flip(arrayB)
v_time = np.concatenate((arrayB, arrayA))

# create probability values for customizing your weights
p = np.linspace(0.01, 1, len(v_time))
p = p / sum(p)

# shuffle array: weighted draw of all elements without replacement
v_timeShuff = np.random.choice(v_time, v_time.size, replace=False, p=p)
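For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch of the same approach. A few things in it are my own assumptions and not part of the original data: the helper name `weighted_shuffle`, the placeholder array (15 values, each repeated 584 times to reach 8760 entries, whereas the real data has a different count per value), the arbitrary seed, and the use of NumPy's newer `default_rng` generator in place of `np.random.choice`.

```python
import numpy as np


def weighted_shuffle(values, rng):
    """Hypothetical helper: weighted draw of all values without replacement."""
    v = np.sort(np.asarray(values, dtype=float))
    # Descending odd-indexed half followed by ascending even-indexed half,
    # so large values sit at both ends and small values in the middle.
    arranged = np.concatenate((v[1::2][::-1], v[0::2]))
    # Linearly increasing weights, normalised so they sum to 1.
    p = np.linspace(0.01, 1.0, arranged.size)
    p /= p.sum()
    # Items with larger weights tend to be drawn earlier.
    return rng.choice(arranged, size=arranged.size, replace=False, p=p)


rng = np.random.default_rng(0)     # seed chosen arbitrarily
a = np.repeat(np.arange(15), 584)  # placeholder data: 15 * 584 = 8760 values
shuffled = weighted_shuffle(a, rng)

# Compare the mean of each quarter to see how strongly the trend shows up.
quarters = np.array_split(shuffled, 4)
print([round(float(q.mean()), 2) for q in quarters])
```

The result is a true permutation of the input (every value keeps its number of occurrences). The linear weights give only a soft bias, so it is worth inspecting the printed quarter means and, if the trend is too weak for your purposes, trying a steeper weight curve than `np.linspace`.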