I have a set of experiment days and subjects (anonymised subset below) in a dataframe. How do I generate all the pairwise comparisons per day in a new dataframe, where subjects also play the role of experimenter?
Input:
Day | Subject |
---|---|
Monday | Alpha |
Monday | Bravo |
Monday | Charlie |
Wednesday | Delta |
Wednesday | Echo |
Wednesday | Foxtrot |
Wednesday | Golf |
Wednesday | Hotel |
Expected Output:
Day | Subject | Experimenter |
---|---|---|
Monday | Alpha | Bravo |
Monday | Alpha | Charlie |
Monday | Bravo | Charlie |
Wednesday | Delta | Echo |
Wednesday | Delta | Foxtrot |
Wednesday | Delta | Golf |
Wednesday | Delta | Hotel |
Wednesday | Echo | Foxtrot |
Wednesday | Echo | Golf |
Wednesday | Echo | Hotel |
Wednesday | Foxtrot | Golf |
Wednesday | Foxtrot | Hotel |
Wednesday | Golf | Hotel |
So far, I am only able to generate the full set of combinations across all days, not the combinations within each day:
import numpy as np
import pandas as pd
import itertools as it
df = pd.DataFrame({
    'Day': ['Monday', 'Monday', 'Monday',
            'Wednesday', 'Wednesday', 'Wednesday', 'Wednesday', 'Wednesday'],
    'Subject': ['Alpha', 'Bravo', 'Charlie',
                'Delta', 'Echo', 'Foxtrot', 'Golf', 'Hotel'],
})
pair_order_list = it.combinations(df['Subject'], 2)
pairs = list(pair_order_list)
Actual Output:
[('Alpha', 'Bravo'), ('Alpha', 'Charlie'), ('Alpha', 'Delta'),...]
Any advice would be welcome.
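One possible approach, offered as a minimal sketch rather than the definitive answer: group the dataframe by `Day` and apply `itertools.combinations` to the subjects within each group, instead of to the whole `Subject` column. The names `rows` and `pairs_by_day` are made up for illustration, and `sort=False` is assumed so the days keep the order they appear in the input.

```python
import itertools as it

import pandas as pd

df = pd.DataFrame({
    'Day': ['Monday', 'Monday', 'Monday',
            'Wednesday', 'Wednesday', 'Wednesday', 'Wednesday', 'Wednesday'],
    'Subject': ['Alpha', 'Bravo', 'Charlie',
                'Delta', 'Echo', 'Foxtrot', 'Golf', 'Hotel'],
})

# Build one (Day, Subject, Experimenter) row per unordered pair,
# pairing subjects only with others from the same day.
rows = [
    (day, subject, experimenter)
    for day, group in df.groupby('Day', sort=False)  # sort=False keeps input order
    for subject, experimenter in it.combinations(group['Subject'], 2)
]

# 'pairs_by_day' is a hypothetical name for the resulting dataframe.
pairs_by_day = pd.DataFrame(rows, columns=['Day', 'Subject', 'Experimenter'])
print(pairs_by_day)
```

With the sample data this should give 3 rows for Monday and 10 for Wednesday, with no pairs that cross days (for example, no Alpha/Delta row).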