
Let's say I have the following dataframe df. How can I take a value N and return only the rows where at least N columns share the same value? For example, if N=3, it would return rows 0, 2, 3 and 4. If N=4, then only row 3.

   'A'  'B'  'C'  'D'  'E'
0    1    1    1    3    5
1    5    4    2    1    2
2    3    4    3    2    3
3    5    5    5    4    5
4    1    2    1    2    1

I've found answers like this one that cover the case where all values are the same, but I can't think of a clean way to adapt them to an arbitrary number of matching columns.

NeonBlueHair

1 Answer


We can use value_counts on each row and then ge (which means >=) to test the counts; change the 3 to whatever N you need.

df[df.apply(pd.Series.value_counts, axis=1).ge(3).any(axis=1)]
Out[257]: 
   'A'  'B'  'C'  'D'  'E'
0    1    1    1    3    5
2    3    4    3    2    3
3    5    5    5    4    5
4    1    2    1    2    1
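
As a minimal sketch, the same idea can be wrapped in a function that takes N as a parameter. The function name rows_with_repeated_value and the unquoted column labels are assumptions made for illustration, not part of the original answer; the data is copied from the question.

```python
import pandas as pd

# Sample data from the question (column labels written without the quotes,
# which is an assumption made for readability).
df = pd.DataFrame(
    {
        "A": [1, 5, 3, 5, 1],
        "B": [1, 4, 4, 5, 2],
        "C": [1, 2, 3, 5, 1],
        "D": [3, 1, 2, 4, 2],
        "E": [5, 2, 3, 5, 1],
    }
)


def rows_with_repeated_value(frame, n):
    """Return the rows of `frame` in which some value appears in at least `n` columns.

    Hypothetical helper name, used only for this illustration.
    """
    # Count how often each distinct value occurs in every row; a value that
    # does not occur in a given row shows up as NaN in the resulting frame.
    counts = frame.apply(pd.Series.value_counts, axis=1)
    # NaN >= n evaluates to False, so absent values never pass the test.
    mask = counts.ge(n).any(axis=1)
    return frame[mask]


print(rows_with_repeated_value(df, 3).index.tolist())  # [0, 2, 3, 4]
print(rows_with_repeated_value(df, 4).index.tolist())  # [3]
```

Note that this selects rows where a value occurs N or more times, which matches the expected results in the question. If exactly N matches are wanted instead, replacing ge with eq should be enough.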
BENY