I have a DataFrame like this:
     a         b  c
0    0  0.326783  1
1    1  0.356272  1
2    2  0.797407  1
3    3  0.098846  1
4    4  0.528812  1
5    5  0.913114  1
6    6  0.630039  2
7    7  0.475828  2
8    8  0.619713  2
9    9  0.756735  2
10  10  0.168544  2
11  11  0.337957  3
12  12  0.201395  3
13  13  0.272564  3
14  14  0.757490  3
15  15  0.032135  4
16  16  0.598143  4
17  17  0.150696  4
18  18  0.001403  4
19  19  0.427624  4
I want to randomly split it into 3 subgroups with given proportions (e.g. [0.5, 0.3, 0.2]), while preserving the proportion of labels in column c within each subgroup.
I tried a recursive approach with df.groupby('c').sample(frac=...), sampling one subgroup, then the next from the remaining rows, and so on.
The problem is that one of the subgroups ended up with no rows labelled c=3.
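The sequential attempt described above can be sketched roughly as follows. This is a reconstruction, not my exact code: the frame is rebuilt with random values for b, the names `fractions`, `remaining`, `left`, and `subgroups` are made up for illustration, and the fraction is rescaled at each step so that it refers to the original size rather than to the shrinking remainder.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the frame above; b is random, so values differ.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": range(20),
    "b": rng.random(20),
    "c": [1] * 6 + [2] * 5 + [3] * 4 + [4] * 5,
})

fractions = [0.5, 0.3, 0.2]
remaining = df
left = 1.0  # share of the original frame that is still unassigned
subgroups = []
for frac in fractions[:-1]:
    # frac of the original frame corresponds to frac / left of what remains.
    part = remaining.groupby("c").sample(frac=frac / left, random_state=0)
    subgroups.append(part)
    remaining = remaining.drop(part.index)
    left -= frac
subgroups.append(remaining)  # the last subgroup takes whatever is left
```

Because the number of rows drawn per label is rounded at every step, a label with only a handful of rows can be under-represented or missing in a later subgroup, depending on how the fractions are passed.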
What is the best way to do this so that the split respects both the given subgroup proportions (the [0.5, 0.3, 0.2] list above) and the proportions of label c inside each sampled subgroup?
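One possible direction, as a minimal sketch rather than a definitive answer: shuffle the rows of each label separately and cut each shuffled group at the cumulative proportions, so every subgroup receives its share of every label in a single pass. The helper name `stratified_split` and its parameters are hypothetical, and the frame is again rebuilt with random values for b.

```python
import numpy as np
import pandas as pd

def stratified_split(df, label, fractions, seed=0):
    """Split df into len(fractions) parts, cutting each label group
    at the cumulative fractions (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    cuts = np.cumsum(fractions)[:-1]  # e.g. [0.5, 0.8] for [0.5, 0.3, 0.2]
    parts = [[] for _ in fractions]
    for _, group in df.groupby(label):
        # Shuffle the row labels of this group, then cut at the rounded bounds.
        idx = rng.permutation(group.index.to_numpy())
        bounds = np.round(cuts * len(idx)).astype(int)
        for part, chunk in zip(parts, np.split(idx, bounds)):
            part.extend(chunk)
    return [df.loc[sorted(p)] for p in parts]

# Hypothetical reconstruction of the frame above; b is random, so values differ.
df = pd.DataFrame({
    "a": range(20),
    "b": np.random.default_rng(0).random(20),
    "c": [1] * 6 + [2] * 5 + [3] * 4 + [4] * 5,
})
parts = stratified_split(df, "c", [0.5, 0.3, 0.2])
```

With groups this small the proportions can only be approximate, and a label with fewer rows than there are subgroups cannot appear in all of them.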