I have a dataframe as follows:
name teamA teamB
foo a b
foo b c
foo c b
bar a e
bar a d
...
For each name separately, I want to find the value that appears in every row of that group, looking at both columns teamA and teamB together, and then blank out every cell that contains that value. In this example, the common value for name "foo" is "b", and for name "bar" it is "a". After removing these values, the data frame would look like this:
name teamA teamB
foo a " "
foo " " c
foo c " "
bar " " e
bar " " d
...
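To make the goal concrete, here is a minimal sketch of the transformation described above. It assumes pandas, the five sample rows shown, and a helper name (blank_common) that is purely illustrative; it is one possible way to express the idea, not necessarily the best one.

```python
import pandas as pd

# Sample data copied from the example above (the "..." rows are omitted).
df = pd.DataFrame({
    "name":  ["foo", "foo", "foo", "bar", "bar"],
    "teamA": ["a", "b", "c", "a", "a"],
    "teamB": ["b", "c", "b", "e", "d"],
})

TEAM_COLS = ["teamA", "teamB"]

def blank_common(group):
    """Blank out the teams that appear in every row of one name group."""
    # One set of teams per row, then intersect those sets across all rows.
    row_sets = [set(row) for row in group[TEAM_COLS].to_numpy()]
    common = set.intersection(*row_sets)
    out = group.copy()
    # Replace cells holding a common value with " ", keep the others.
    out[TEAM_COLS] = out[TEAM_COLS].mask(out[TEAM_COLS].isin(list(common)), " ")
    return out

# Process each name separately and stitch the groups back together.
result = pd.concat([blank_common(g) for _, g in df.groupby("name", sort=False)])
print(result)
```

Looping over the groups explicitly keeps the per-group logic easy to follow; a vectorised version may be preferable for large frames.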
At first I tried keeping teamA and teamB in a single column, named for example teams:
name teams
foo [a, b]
foo [b, c]
foo [c, b]
...
and afterwards I would like to get:
name teams
foo [a, " "]
foo [" ", c]
foo [c, " "]
...
However, I've read that splitting the teams into two columns is the recommended approach, and I found an interesting answer, but I don't know how to apply it to a grouped data frame: https://stackoverflow.com/a/55554709/9168586 (see the "Filter on MANY Columns" section, under "to retain rows where at least one column is True"). As in that example:
dataframe[['teamA', 'teamB']].isin('b').any(axis=1)
0 True
1 True
2 True
3 True
dtype: bool
where 'b' would be one of the values (teams) that I iterate over. In each iteration, if the result is True for every row of the group, I would remove that value from teamA or teamB in every row and then continue with the next group.
The errors I get are:
Cannot access callable attribute 'isin' of 'DataFrameGroupBy' objects, try using the 'apply' method
and
only list-like or dict-like objects are allowed to be passed to DataFrame.isin(), you passed a 'str'
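For reference, a small sketch of how both messages can be avoided, assuming the sample frame from the top of the post. The second message goes away when the value is wrapped in a list, and the first can be sidestepped by running isin on the plain data frame and grouping the resulting boolean Series afterwards. The variable names hit and in_every_row are illustrative only.

```python
import pandas as pd

# Sample data copied from the example at the top (the "..." rows are omitted).
df = pd.DataFrame({
    "name":  ["foo", "foo", "foo", "bar", "bar"],
    "teamA": ["a", "b", "c", "a", "a"],
    "teamB": ["b", "c", "b", "e", "d"],
})

# DataFrame.isin expects a list-like, so pass ["b"] rather than the string "b".
hit = df[["teamA", "teamB"]].isin(["b"]).any(axis=1)

# Call isin on the plain frame, then group the boolean result by name:
# True for a row only if "b" occurs in every row of that row's group.
in_every_row = hit.groupby(df["name"]).transform("all")

print(pd.DataFrame({"name": df["name"], "hit": hit, "in_every_row": in_every_row}))
```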