I've been reading that it is best practice to avoid iterrows when iterating over a pandas DataFrame, but I'm not sure how else to solve my particular problem.
How can I:
- Find the "time" of the first instance of the value "c" in one DataFrame, df1, grouped by "num" and sorted by "time"
- Then add that "time" into a separate DataFrame, df2, based on "num".
Here is an example of my input DataFrame:
import pandas as pd

# Input DataFrame (referred to as df1 in the description above)
df1 = pd.DataFrame({
    'num': [2, 2, 2, 2, 5, 5, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8,
            8, 8, 8, 8, 9, 9, 9, 9, 9],
    'state': ['a', 'b', 'c', 'b', 'a', 'b', 'c', 'b', 'c', 'b', 'c', 'a',
              'b', 'c', 'b', 'c', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'b',
              'c', 'b', 'c', 'b', 'c', 'b'],
    'time': [234, 239, 244, 249, 100, 105, 110, 115, 120, 125, 130, 3, 8,
             13, 18, 23, 28, 33, 551, 556, 561, 566, 571, 576, 581, 45, 50,
             55, 60, 65]})
Expected output (df2):
num  time
  2   244
  5   110
  7    13
  8   561
  9    50
Every solution I attempt seems like it would require iterrows to load the "time" into df2.
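
For reference, the two steps above might be expressed without iterrows roughly as follows. This is only a minimal sketch under some assumptions: that df2 already exists with a "num" column (the df2 built here is a hypothetical stand-in), and that "first instance when sorted by time" is equivalent to the smallest "time" among the "c" rows of each group. The name first_c is made up for illustration.

```python
import pandas as pd

# Same input data as in the question, named df1 to match the description.
df1 = pd.DataFrame({
    'num': [2, 2, 2, 2, 5, 5, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8,
            8, 8, 8, 8, 9, 9, 9, 9, 9],
    'state': ['a', 'b', 'c', 'b', 'a', 'b', 'c', 'b', 'c', 'b', 'c', 'a',
              'b', 'c', 'b', 'c', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'b',
              'c', 'b', 'c', 'b', 'c', 'b'],
    'time': [234, 239, 244, 249, 100, 105, 110, 115, 120, 125, 130, 3, 8,
             13, 18, 23, 28, 33, 551, 556, 561, 566, 571, 576, 581, 45, 50,
             55, 60, 65]})

# Assumption: df2 is a separate frame keyed by "num" (hypothetical stand-in).
df2 = pd.DataFrame({'num': [2, 5, 7, 8, 9]})

# Step 1: keep only the "c" rows, then take the earliest "time" per "num".
# Taking the minimum is the same as sorting by "time" and picking the first row.
first_c = (
    df1[df1['state'] == 'c']
    .groupby('num', as_index=False)['time']
    .min()
)

# Step 2: attach that "time" to df2 by matching on "num".
# A left merge keeps every row of df2; a "num" with no "c" row would get NaN.
df2 = df2.merge(first_c, on='num', how='left')

print(df2)
```

With the sample data, this should reproduce the expected df2 shown above. Whether a left merge is the right join depends on how df2 is actually built, which the question does not show.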