
I'm new to Python and trying to use the numpy.random.triangular function to run a series of Monte Carlo simulations from several triangular distributions and then append the simulation outputs from each run. The sample data is below.

ID  Low  Mode  High  
A   10   15    25
B   7    20    22
C   2    18    20
D   1    4     5
E   13   25    34

I would like to run 10000 runs for each ID and append the results. I know I can run a single ID, for example ID A, using np.random.triangular(10, 15, 25, 10000). I may need to write a for loop to run all IDs and append the results. Thank you!

Update!

The expected output format is:

ID Run      Output
A   1       11
A   2       23
.
.
.
A   10000   18
B   1       21.5
B   2       9
.   .       .
.   .       .
.   .       .
B   10000   19
C   1       2.5
C   2       13
.   .       .
.   .       .
.   .       .
BMM
  • Could you give an example of the output you want so I know how to write the answer? – DPM Oct 28 '20 at 18:40
  • @DPM - I updated my post with the expected output. Thank you! – BMM Oct 28 '20 at 18:51
  • Isn't it simpler if there is a list for every ID? Also are you using the library pandas? – DPM Oct 28 '20 at 18:59
  • @DPM - I prefer one dataframe with all the output so that I can summarize (like mean) for each ID in one table. I have more than 50 IDs in my dataset. – BMM Oct 28 '20 at 19:05
  • Ok, I'll edit my answer to take that into consideration – DPM Oct 28 '20 at 19:08
  • The Run column doesn't make much sense, because the numpy function returns the whole sample in one call; it only runs once – DPM Oct 28 '20 at 19:17
  • I have written the code you want but it takes a long time to run, think carefully if you really need the run column because it is unnecessary – DPM Oct 28 '20 at 20:01
  • @DPM - the Run column is not a must as long as I get the other two columns (ID and Output) as an output. Thank you! – BMM Oct 28 '20 at 20:08
  • Ok, I'll just check if the code is right and then upload the answer – DPM Oct 28 '20 at 20:09
  • One thing I suggest: instead of a row per run, have a row per ID, and in each row the output will be an array holding the sample – DPM Oct 28 '20 at 20:12
  • I think that should be fine. – BMM Oct 28 '20 at 20:15
  • @DPM - Thank you! I tried your code and got the following error message raise TypeError("data argument can't be an iterator") TypeError: data argument can't be an iterator ... the error occurs when you assign df. – BMM Oct 28 '20 at 20:34
  • It's because of zip. Check this https://stackoverflow.com/questions/45388800/python-data-argument-cant-be-an-iterator/45388890 – DPM Oct 28 '20 at 21:14
  • I didn't have that error – DPM Oct 28 '20 at 21:14
  • @DPM - thanks! It works now. I modified it to get what I wanted. Thanks again! – BMM Oct 29 '20 at 18:03

1 Answer


df is a DataFrame holding your data. The loop iterates over its rows, so each row of the new DataFrame df2 ends up holding an array with the 10000 samples for that ID.

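import pandas as pd
import numpy as np

Low = [10, 7, 2, 1, 13]
Mode = [15, 20, 18, 4, 25]
High = [25, 22, 20, 5, 34]
ID = ['A', 'B', 'C', 'D', 'E']

# list(...) avoids "data argument can't be an iterator" on some pandas versions
df = pd.DataFrame(list(zip(Low, Mode, High)), columns=['Low', 'Mode', 'High'], index=ID)

# One row per ID; object dtype so each 'Output' cell can hold a whole array
df2 = pd.DataFrame(columns=['Output'], index=ID, dtype=object)

for idx, row in df.iterrows():
    # Draw 10000 samples from this ID's triangular distribution
    df2.at[idx, 'Output'] = np.random.triangular(row['Low'], row['Mode'], row['High'], 10000)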
An example of the output: [image not preserved]
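
If you do want the single table discussed in the comments (one row per ID and run, so you can summarize per ID), a vectorized variant is possible. This is only a minimal sketch, not the code above: it assumes NumPy's Generator API (np.random.default_rng) and relies on triangular broadcasting array-valued parameters against the requested size. The names params, long_df and summary, and the seed 0, are illustrative choices.

```python
import numpy as np
import pandas as pd

# Parameters from the question's sample table
params = pd.DataFrame(
    {"Low": [10, 7, 2, 1, 13],
     "Mode": [15, 20, 18, 4, 25],
     "High": [25, 22, 20, 5, 34]},
    index=["A", "B", "C", "D", "E"],
)

n_runs = 10000                  # runs per ID, as in the question
rng = np.random.default_rng(0)  # seed is an arbitrary choice, for reproducibility

# Column vectors of shape (n_ids, 1) broadcast against size (n_ids, n_runs),
# so row i of `samples` holds all runs for the i-th ID.
samples = rng.triangular(
    params["Low"].to_numpy()[:, None],
    params["Mode"].to_numpy()[:, None],
    params["High"].to_numpy()[:, None],
    size=(len(params), n_runs),
)

# Flatten row by row into the ID / Run / Output layout from the question
long_df = pd.DataFrame({
    "ID": np.repeat(params.index.to_numpy(), n_runs),
    "Run": np.tile(np.arange(1, n_runs + 1), len(params)),
    "Output": samples.ravel(),
})

# Per-ID summary in one table
summary = long_df.groupby("ID")["Output"].agg(["mean", "min", "max"])
```

Because the parameters are read from a DataFrame, the same sketch should carry over to 50+ IDs without a Python-level loop, at the cost of holding all samples in memory at once.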

DPM