I have a dataframe like this:
id company ......
111 A
222 B
333 B
111 E
444 C
555 C
555 C
333 A
111 A
222 D
444 C
and I would like to get the rows where the id occurs in the same company at least twice. So the result would be:
id company .......
111 A
444 C
555 C
555 C
111 A
444 C
Although id 222 was there twice, it was with a different company each time, so it is removed. id 111 was there 3 times but only twice with the same company, so only the 2 rows from that company remain. And so on.
Rows can occur with the same company more than twice.
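To make the rule concrete, the condition can be sketched as follows. This is only a minimal illustration, not necessarily the idiomatic way: it assumes `id` is an ordinary column rather than the index (if it is the index, `df.reset_index()` would be needed first), and the names `pair_count` and `result` are made up for the example.

```python
import pandas as pd

# Sample data from above; "id" is assumed to be an ordinary column here.
df = pd.DataFrame({
    "id": [111, 222, 333, 111, 444, 555, 555, 333, 111, 222, 444],
    "company": ["A", "B", "B", "E", "C", "C", "C", "A", "A", "D", "C"],
})

# For every row, count how many rows share its (id, company) pair.
pair_count = df.groupby(["id", "company"])["id"].transform("size")

# Keep rows whose pair occurs at least twice; the original row order is preserved.
result = df[pair_count >= 2]
print(result)
```

On the sample data this keeps the six rows listed in the expected result. Because the threshold is "at least twice", `df[df.duplicated(["id", "company"], keep=False)]` should select the same rows.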
There are some Stack Overflow questions that deal with selecting rows where a value appears more than once (such as "How to select rows in Pandas dataframe where value appears more than once"), but I cannot find any that deal with an index + column pair occurring more than once.