Suppose I have the input below:

import pandas as pd

df = pd.DataFrame({
    'A': [1,1,1,1,2,1,2,1,3,4,4,4],
    'B': [5,5,6,7,5,6,6,7,7,7,7,7],
    'C': [1,1,1,1,1,1,1,1,1,1,1,1]
})



The expected output is:
 df
        A  B  C flag
    0   1  5  1  1
    1   1  5  1  1
    2   1  6  1  0
    3   1  7  1  0
    4   2  5  1  0
    5   2  6  1  0
    6   1  6  1  0
    7   3  7  1  0
    8   3  7  3  0
    9   4  7  1  1
    10  4  7  1  1
    11  4  7  1  1

I want to flag rows where the same data repeats across 2 or more consecutive rows. Could you please help me out? I copied the data from here.

Also, suppose I want to flag data that repeats across n or more rows, where n could vary from 2 to 10.

Pramod_S

1 Answer

IIUC, you can try:

s = df.eq(df.shift()).all(1)
df['flag'] = (s | s.shift(-1)).astype(int)

OUTPUT:

    A  B  C  flag
0   1  5  1     1
1   1  5  1     1
2   1  6  1     0
3   1  7  1     0
4   2  5  1     0
5   1  6  1     0
6   2  6  1     0
7   3  7  1     0
8   3  7  3     0
9   4  7  1     1
10  4  7  1     1
11  4  7  1     1

df used:

df = pd.DataFrame( {
   'A': [1,1,1,1,2,1,2,3,3,4,4,4],
   'B': [5,5,6,7,5,6,6,7,7,7,7,7],
   'C': [1,1,1,1,1,1,1,1,3,1,1,1]})
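
For the follow-up about a variable n, one possible generalization is sketched below. It is not part of the two-liner above, and `flag_runs` is a hypothetical helper name. The idea is to give each run of consecutive identical rows an id, measure the run length, and flag rows whose run has at least n rows. It assumes "repetitive" means consecutive identical rows and uses >= n; switch the comparison to > n if strictly more than n is wanted.

```python
import pandas as pd


def flag_runs(frame: pd.DataFrame, n: int = 2) -> pd.Series:
    """Return 1 for rows inside a run of >= n consecutive identical rows, else 0."""
    # True where a row differs from the previous row in any column.
    # The first row is compared against NaN, so it always starts a new run.
    starts = frame.ne(frame.shift()).any(axis=1)
    # The cumulative sum of the run starts gives every run its own id.
    run_id = starts.cumsum()
    # Length of the run each row belongs to.
    run_len = run_id.groupby(run_id).transform("size")
    return (run_len >= n).astype(int)


df = pd.DataFrame({
    'A': [1,1,1,1,2,1,2,3,3,4,4,4],
    'B': [5,5,6,7,5,6,6,7,7,7,7,7],
    'C': [1,1,1,1,1,1,1,1,3,1,1,1]})

cols = ['A', 'B', 'C']  # compare only the data columns, not the flag itself
df['flag'] = flag_runs(df[cols], n=2)
print(df)
```

With n=2 this should give the same flags as the OUTPUT above; with n=3 only rows 9 to 11 are flagged. Note that rows containing NaN never count as equal to each other here, since NaN != NaN.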
Nk03
Suppose I want to flag repetitions greater than n. Would that need only a minor change in the code? n could vary between 2 and 10. – Pramod_S Jul 06 '21 at 19:12