I want to filter the values in my dataframe based on the occurrence of a 1 in my column events. When a 1 occurs, the 1 and everything after it should be removed.
This works when I have a single list:
event = [5, 5, 5, 5, 1, 5]
index = event.index(1)
event[:index]
outputs:
[5, 5, 5, 5]
Now I want to do this for my whole dataframe, which looks like this:
| session_id | events |
|-------------|--------|
| 00000000000 | [4,5,5,3,2,1,5] |
| 00000000001 | [4,5,5,1,2,1,5,5,5] |
| 00000000002 | [4,5,1,3,2,1,5,5,5,1] |
import pandas as pd
df = pd.DataFrame([['00000000000', [4, 5, 5, 3, 2, 1, 5]],
                   ['00000000001', [4, 5, 5, 1, 2, 1, 5, 5, 5]],
                   ['00000000002', [4, 5, 1, 3, 2, 1, 5, 5, 5, 1]]],
                  columns=['session_id', 'events'])
But I cannot seem to get it right. Is someone able to help me? My latest attempt was this:
for i, row in df.iterrows():
    target_id = row['events'].index(1)
    df['events_short'] = row['events'][:target_id]
But it gives me the following error:
ValueError: Length of values (4) does not match length of index (10)
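As far as I can tell, the error comes from `df['events_short'] = ...` assigning to the whole column on every iteration, so pandas expects one value per row rather than one shortened list. Below is a minimal sketch of what I am trying to achieve, applying the single-list logic row by row with `Series.apply`. The helper name `truncate_at_one` is hypothetical, and keeping lists that contain no 1 unchanged is an assumption on my part.

```python
import pandas as pd

df = pd.DataFrame([['00000000000', [4, 5, 5, 3, 2, 1, 5]],
                   ['00000000001', [4, 5, 5, 1, 2, 1, 5, 5, 5]],
                   ['00000000002', [4, 5, 1, 3, 2, 1, 5, 5, 5, 1]]],
                  columns=['session_id', 'events'])


def truncate_at_one(events):
    """Hypothetical helper: return the items before the first 1."""
    if 1 in events:
        return events[:events.index(1)]
    # Assumption: a list without any 1 is returned unchanged.
    return events


# Apply the helper to each row's list instead of assigning inside a loop.
df['events_short'] = df['events'].apply(truncate_at_one)
print(df['events_short'].tolist())
# [[4, 5, 5, 3, 2], [4, 5, 5], [4, 5]]
```

The `1 in events` check is only there because `list.index` raises a `ValueError` when the value is missing.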