I'm using pandas DataFrames in Python to analyse some logs. I have a CSV with columns something like: number_items event_type ... ... ... session_id ... ... ...
My problem is that each session contains different types of events, and only one of them has a value for number_items. Yet number_items is what interests me.
So what I want to see is how each parameter of each event influences number_items.
To do that, I want to:
1. Copy the number_items of the event that has it (always the last one in the session) to all the other events of the session.
2. Split each event_type into a separate DataFrame (to avoid the many nulls that exist only because an attribute doesn't apply to that event) and analyse it.
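For the second step, a minimal sketch could look like the following. It assumes number_items has already been copied to every row of its session; the sample data and the page and button columns are made up for illustration, since the real column names are not shown.

```python
import pandas as pd

# Hypothetical sample data; "page" and "button" stand in for event-specific columns.
df = pd.DataFrame({
    "session_id": ["a", "a", "b", "b"],
    "event_type": ["view", "click", "view", "click"],
    "page": ["home", None, "cart", None],
    "button": [None, "buy", None, "pay"],
    "number_items": [3.0, 3.0, 5.0, 5.0],
})

# Build one DataFrame per event type and drop the columns
# that are entirely null for that event type.
frames = {
    event_type: group.dropna(axis=1, how="all")
    for event_type, group in df.groupby("event_type")
}

print(frames["view"])
print(frames["click"])
```

Each entry of frames then holds only the columns that actually carry values for that event type, plus session_id and number_items.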
I'm stuck on the first part.
I tried something like this:
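    currentSession = '0'
    currentItems = 0
    # Idea: walk the rows from last to first, because the last event of a session holds number_items.
    # Note: reversed() raises a TypeError here, since iterrows() returns a generator.
    for index, row in reversed(df.iterrows()):
        if row['session_id'] == currentSession:
            # Note: row may be a copy, so this assignment does not reliably update df.
            row['number_items'] = currentItems
        else:
            currentSession = row['session_id']
            currentItems = row['number_items']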
Obviously it doesn't work; I just wanted to show the idea.
I'm fairly new to Python, so I would appreciate some help.
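One possible way to express the same idea without an explicit loop is a groupby on session_id followed by transform. This is only a sketch under stated assumptions: the sample data is invented, and it relies on number_items being null on every event of a session except the one that carries the value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only the last event of each session has number_items.
df = pd.DataFrame({
    "session_id": ["a", "a", "a", "b", "b"],
    "event_type": ["view", "click", "checkout", "view", "checkout"],
    "number_items": [np.nan, np.nan, 3.0, np.nan, 5.0],
})

# For every row, take the last non-null number_items found in its session
# and broadcast it back to all rows of that session.
df["number_items"] = df.groupby("session_id")["number_items"].transform("last")

print(df)
```

Because "last" skips nulls within each group, this does not depend on the rows being sorted so that the event with the value comes last.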
Thanks
edit: data sample here
For security reasons, I kept only the relevant information.