I have a pandas DataFrame that looks like this:
| Cliid | Segment | Insert |
|-------|---------|--------|
| 001 | A | 0 |
| 002 | A | 0 |
| 003 | C | 0 |
| 004 | B | 1 |
| 005 | A | 0 |
| 006 | B | 0 |
I want to split it into two groups so that each group has the same composition as the full dataset for every variable in [Segment, Insert]. For example, in each group 1/2 of the observations would belong to segment A, 1/6 would have Insert = 1, and so on.
I've checked this answer, but it only stratifies on one variable; it won't work for more than one.
R has this function that does exactly that, but using R is not an option.
By the way, I'm using Python 3.
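
To make the goal concrete, here is a minimal sketch of one way such a split might be approached. It is an illustration under assumptions, not a verified solution. It treats every (Segment, Insert) combination as one stratum, shuffles the rows, sorts them so each stratum is contiguous, and then deals the rows out to the groups in turn. The helper name `assign_groups`, the `Group` column, and the `seed` parameter are made up for this example, and the DataFrame index is assumed to be unique.

```python
import numpy as np
import pandas as pd

# The example data from the table above.
df = pd.DataFrame({
    "Cliid": ["001", "002", "003", "004", "005", "006"],
    "Segment": ["A", "A", "C", "B", "A", "B"],
    "Insert": [0, 0, 0, 1, 0, 0],
})


def assign_groups(frame, cols, n_groups=2, seed=0):
    """Hypothetical helper: return a group label (0..n_groups-1) per row.

    Assumes frame.index is unique.
    """
    # Shuffle so that the assignment is not tied to the original row order.
    shuffled = frame.sample(frac=1, random_state=seed)
    # Sort by the stratification columns so each stratum is contiguous.
    ordered = shuffled.sort_values(cols)
    # Deal rows out round-robin; within any stratum, the per-group
    # counts can then differ by at most one.
    labels = pd.Series(np.arange(len(ordered)) % n_groups, index=ordered.index)
    # Return the labels in the original row order.
    return labels.reindex(frame.index)


df["Group"] = assign_groups(df, ["Segment", "Insert"])

# Inspect how each (Segment, Insert) combination is spread over the groups.
print(pd.crosstab([df["Segment"], df["Insert"]], df["Group"]))
```

With only six rows, several strata contain a single observation, so the proportions cannot match exactly; the sketch only keeps the groups the same size and each stratum as evenly spread as the counts allow.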