Want to select:
select
user_id,
max(case when value > 0 then timestamp else 0 end) as max_timestamp_when_value_is_positive
from df
group by user_id
What is the right way to aggregate?
groupped = raw_data.groupby('user_id')
res = groupped.agg({<how-to-do-described-aggregation?>})
UPDATE Explanation and example.
In [2]: df = pd.DataFrame({'user_id': [1, 1, 1, 2, 2, 3, 3, 3, 3],
'timestamp': [100, 200, 300, 10, 110, 10, 110, 210, 250],
'value': [0, 1, 0, 0, 0, 0, 10, 0, 1]})
In [3]: groupped = df.groupby('user_id')
In [4]: res = groupped.agg({'timestamp': [min, max],
'value': lambda x: sum(x > 0),
<described-magic>})
In [5]: res
Out[5]:
timestamp value <...magic...>
min max <lambda>
user_id
1 100 300 1 200
2 10 110 0 0
3 10 250 2 210
Magic is what I want.