
I have been tasked with translating the simulations inside of the Excel plug-in @Risk to Python. The functionality lines up closely with numpy's random number generation, given a distribution type plus mu and sigma (or high and low values). An example of what I am doing is here.

In the linked example, mu=2 and sigma=1. Using numpy I get the same distribution as @Risk.

dist = np.random.lognormal(2, 1, 1000)

However, when I use numpy with the following parameters, I can no longer replicate the @Risk distributions.

mu=0.4, sigma=0.16 in @Risk: histogram for 1000 samples

and in Python: histogram for 1000 samples

The result is two completely different distributions for the same mu and sigma, so I am now very confused about what numpy expects for the mu and sigma inputs. I've read through the docs linked here, but why would one set of parameters give me matching distributions while another set does not?

What am I missing here?

ARuss
  • Sorry - please ignore the log normal truncated portion of the question. #facepalm – ARuss Jun 20 '18 at 13:15
  • One of your links explains that @RISK has two relevant functions, `RiskLognorm` and `RiskLognorm2`. Which one are you using? – Warren Weckesser Jun 20 '18 at 13:34
  • I am using RiskLognorm – ARuss Jul 19 '18 at 14:02
  • for a more practical example: in @risk, I use the formula Lognorm(0.40,0.16) (mu and sigma) and sample 1000 times. This results in a min=0.11,Max=1.49. Using python np.random.lognormal(mu, sigma, 1000) I get min=0.90, max=2.38 and mean=1.51. What?! – ARuss Jul 19 '18 at 14:18

1 Answer


Take another look at the @RISK documentation that you linked to and the docstring for numpy.random.lognormal. The @RISK function whose parameters match those of numpy.random.lognormal is RiskLognorm2. The parameters for numpy.random.lognormal and RiskLognorm2 are the mean and standard deviation of the underlying normal distribution. In other words, they describe the distribution of the logarithm of the data.
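A quick way to see this (not part of the original answer; the seeded generator is my own addition for reproducibility) is to take the logarithm of a numpy sample and check that its mean and standard deviation come back as the mu and sigma you passed in:

```python
import numpy as np

# Sanity check: with numpy's parametrization, mu and sigma describe
# the distribution of log(sample), not of the sample itself.
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=2.0, sigma=1.0, size=100_000)

log_sample = np.log(sample)
# np.mean(log_sample) should be close to 2.0,
# and np.std(log_sample) close to 1.0.
```

This matches the mu=2, sigma=1 case from the question, which is why that case happened to agree with @RISK.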

The @RISK documentation explains that the parameters for RiskLognorm are the mean and standard deviation of the log-normal distribution itself. It gives the formulas for translating between the two ways of parametrizing the distribution.

If you are sure that the parameters in the @RISK code are correct, then you will have to translate those parameters to the form used by numpy.random.lognormal. Given the values mean and stddev as the parameters used by RiskLognorm, you can get the parameters mu and sigma of numpy.random.lognormal as follows:

sigma_squared = np.log((stddev/mean)**2 + 1)
mu = np.log(mean) - 0.5*sigma_squared
sigma = np.sqrt(sigma_squared)

For example, suppose the mean and std. dev. of the distribution are

In [31]: mean = 0.40

In [32]: stddev = 0.16

Compute mu and sigma:

In [33]: sigma_squared = np.log((stddev/mean)**2 + 1)

In [34]: mu = np.log(mean) - 0.5*sigma_squared

In [35]: sigma = np.sqrt(sigma_squared)

Generate a sample using numpy.random.lognormal, and check its statistics:

In [36]: sample = np.random.lognormal(mu, sigma, size=1000)

In [37]: np.mean(sample)
Out[37]: 0.3936244646485409

In [38]: np.std(sample)
Out[38]: 0.16280712706987954

In [39]: np.min(sample), np.max(sample)
Out[39]: (0.1311593293919604, 1.7218021130668375)
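For convenience, the conversion above can be wrapped in a small helper. This is a sketch, not part of the original answer; the function name and the seeded generator are my own:

```python
import numpy as np

def lognorm_params_from_mean_std(mean, stddev):
    """Convert RiskLognorm-style parameters (mean and std. dev. of the
    log-normal distribution itself) to the (mu, sigma) of the underlying
    normal distribution expected by numpy's lognormal sampler."""
    sigma_squared = np.log((stddev / mean)**2 + 1)
    mu = np.log(mean) - 0.5 * sigma_squared
    sigma = np.sqrt(sigma_squared)
    return mu, sigma

# Reproduce the 0.40 / 0.16 case from the question.
mu, sigma = lognorm_params_from_mean_std(0.40, 0.16)
rng = np.random.default_rng(123)
sample = rng.lognormal(mu, sigma, size=100_000)
# np.mean(sample) should be near 0.40 and np.std(sample) near 0.16.
```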
Warren Weckesser