I have this data frame:
+-----+--------+---------------+-----------+----+
| age | gender | customer type | purchases | id |
+-----+--------+---------------+-----------+----+
|  38 | female | type 1        |        90 |  1 |
|  35 | female | type 2        |       100 |  2 |
|  71 | male   | type 2        |        66 |  3 |
|  68 | female | type 3        |        12 |  4 |
|  26 | male   | type 4        |       900 |  5 |
|  55 | male   | type 5        |        71 |  6 |
|  27 | male   | type 1        |        55 |  7 |
| ... | ...    | ...           |       ... | ...|
+-----+--------+---------------+-----------+----+
I would like a train/test split of 80% train and 20% test within each customer type, with a similar distribution of age and gender in both sets. For example, if for type 1 one of the sets ends up 80% female, it is not a good split.
I tried using the random module with a seed, but I couldn't get it to work because I don't know how to take age and gender into account in the split.
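
To make the goal concrete, a minimal sketch of such a split might look like the following. It is only an illustration under several assumptions: pandas and NumPy are available, the data is a synthetic stand-in with the same column names as the table above (the values are made up), age is bucketed into bins whose edges are chosen arbitrarily, and 20% of the rows are sampled inside every (customer type, gender, age bin) group with a fixed seed.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data frame (all values are made up).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "gender": rng.choice(["female", "male"], size=n),
    "customer type": rng.choice([f"type {i}" for i in range(1, 6)], size=n),
    "purchases": rng.integers(0, 1000, size=n),
    "id": np.arange(1, n + 1),
})

# Bucket age so it can be used as a grouping key.
# The bin edges are an assumption, not something given in the question.
df["age_bin"] = pd.cut(df["age"], bins=[0, 30, 45, 60, 120], labels=False)

# Each combination of these columns is treated as one stratum.
strata = ["customer type", "gender", "age_bin"]

# Sample 20% of the rows inside every stratum, reproducibly.
test = df.groupby(strata).sample(frac=0.2, random_state=42)
# Everything that was not sampled becomes the training set.
train = df.drop(test.index)

# Compare the gender mix per customer type in both sets.
print(pd.crosstab(train["customer type"], train["gender"], normalize="index"))
print(pd.crosstab(test["customer type"], test["gender"], normalize="index"))
```

With very small groups, 20% of a group can round to zero rows, so the overall test share is only approximately 20%; wider age bins would reduce that effect.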
Thank you!!