I want to generate a random/simulated data set with a specific distribution.
As an example, the distribution has the following properties:
- A population of 1000
- The gender mix is: male 49%, female 50%, other 1%
- The age has the following distribution: 0-30 (30%), 31-60 (40%), 61-100 (30%)
The resulting data frame would have 1000 rows and two columns, called gender and age, whose values follow the distributions above.
Is there a way to do this in pandas or another library?
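
For illustration, here is a minimal sketch of one way this might be done with NumPy and pandas. It rests on assumptions that are not stated above: gender and age are sampled independently, ages are whole numbers drawn uniformly within each bracket, and the names `rng`, `age_bins`, and `age_probs` are just illustrative. The seed is fixed only to make the result reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # fixed seed, only for reproducibility
n = 1000

# Gender: draw n labels with the requested probabilities.
gender = rng.choice(["male", "female", "other"], size=n, p=[0.49, 0.50, 0.01])

# Age: first pick a bracket per row, then draw an integer age inside it.
# Assumption: ages are uniform within a bracket (the question does not say).
age_bins = [(0, 30), (31, 60), (61, 100)]
age_probs = [0.30, 0.40, 0.30]
bin_idx = rng.choice(len(age_bins), size=n, p=age_probs)
lows = np.array([lo for lo, _ in age_bins])[bin_idx]
highs = np.array([hi for _, hi in age_bins])[bin_idx]
age = rng.integers(lows, highs, endpoint=True)  # endpoint=True includes the upper bound

df = pd.DataFrame({"gender": gender, "age": age})

# Inspect the realized proportions.
print(df["gender"].value_counts(normalize=True))
print(pd.cut(df["age"], bins=[-1, 30, 60, 100]).value_counts(normalize=True))
```

Because the rows are sampled, the realized proportions will only be close to the targets, not exactly 49/50/1. If exact counts are required, one option is to build the labels with `np.repeat` (for example 490, 500, and 10 copies) and shuffle them with `rng.permutation`.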