Slicing a DataFrame by checking consecutive elements

Question

I have a DataFrame indexed by time, and one of its columns (which takes 2 distinct values) looks like [x,x,y,y,x,x,x,y,y,y,y,x]. I want to slice the DataFrame so that this column has no identical consecutive values. In this example the result would be [x,y,x,y,x], where each kept row is the first one of its run.

Still trying to figure it out...

Thanks!!

score 2 · Accepted Answer · answered May 14 '18 at 00:02

Assuming you have a df like the one below:

import pandas as pd
df=pd.DataFrame(['x','x','y','y','x','x','x','y','y','y','y','x'])

Use shift to compare each value with the previous one, and keep only the rows where they differ:

df[df[0].shift()!=df[0]]
Out[142]: 
    0
0   x
2   y
4   x
7   y
11  x
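
Since the question mentions a time index, here is a minimal sketch of the same shift comparison on a time-indexed frame. The column name "state" and the dates are assumptions made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical time-indexed frame: the column name "state" and the dates are invented.
idx = pd.date_range("2018-05-01", periods=12, freq="D")
df = pd.DataFrame({"state": list("xxyyxxxyyyyx")}, index=idx)

# True where a row differs from the row before it. The first row is compared
# with NaN, so it always counts as the start of a run.
is_run_start = df["state"] != df["state"].shift()
first_of_each_run = df[is_run_start]

print(first_of_each_run["state"].tolist())  # ['x', 'y', 'x', 'y', 'x']
```

The boolean mask keeps the original timestamps, so each surviving row still carries the time at which its run began.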

BENY

Cornelis · Answer 2 · 2018-05-14T00:50:42.677

0

Just loop through the column and save the last element kept:

import pandas as pd

df=pd.DataFrame(['x','x','y','y','x','x','x','y','y','y','y','x'])

kept = [df.index[0]]       # always keep the first row
old = df[0].iloc[0]        # last value kept
for idx, value in df[0].items():
    if value != old:       # value changed: a new run starts here
        kept.append(idx)
        old = value
df2 = df.loc[kept]         # rows at index 0, 2, 4, 7, 11

EDIT:

Or, for a plain vector, use a list:

>>> L=[1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [x[0] for x in groupby(L)]
[1, 2, 3, 4, 5, 1, 2]
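
The list version returns only the values, so the index is lost. A rough sketch of how the same groupby idea could keep the index labels, assuming the single-column df used earlier in this answer (the names first_labels and df2 are just illustrative):

```python
from itertools import groupby
from operator import itemgetter

import pandas as pd

df = pd.DataFrame(['x', 'x', 'y', 'y', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'x'])

# Series.items() yields (label, value) pairs. Group consecutive pairs by value
# and take the label of the first pair in each run.
first_labels = [next(run)[0] for _, run in groupby(df[0].items(), key=itemgetter(1))]
df2 = df.loc[first_labels]

print(first_labels)  # [0, 2, 4, 7, 11]
```

This still iterates in Python, so for large frames the shift comparison from the accepted answer is likely the better fit.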

edited May 14 '18 at 00:50

answered May 14 '18 at 00:10

Cornelis

Please don't use loops with DataFrames, that's like mixing vegetable oil in gasoline. – cs95 May 14 '18 at 00:12
@COLDSPEED I mean if you want to use dataframes for what can be done by a list then that is on you. Added a simple solution using lists. – Cornelis May 14 '18 at 00:49
