I am trying to get the first row per group (here, Car is the group). With the data below, however, my code shows 45 as Fred's time, which is actually Betsy's time from the next row. I would like the output to be the first full row for Car A and the first full row for Car B, even if the Time column contains np.nan.
Can someone help me understand what I'm doing wrong, and why my code combines information from different rows like this?
Thanks!
import numpy as np
import pandas as pd
test_df = pd.DataFrame({'Race': [1, 1, 1, 2, 2, 2],
                        'Car': ['A', 'A', 'A', 'B', 'B', 'B'],
                        'Date': ['5/1/2019', '4/15/2019', '3/1/2019', '5/1/2019', '2/1/2019', '1/5/2019'],
                        'Driver': ['Fred', 'Betsy', 'John', 'John', 'Frank', 'Frank'],
                        'Time': [np.nan, 45, 46, 47, 44, 43]})
test_df = test_df.sort_values(['Race', 'Car', 'Date'], ascending=[True, True, False]).groupby(['Car'], as_index=False).first()
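For reference, here is a minimal sketch of the behavior on a smaller, made-up frame. It assumes the cause is that `GroupBy.first()` returns the first non-null value of each column within a group rather than the first row, while `GroupBy.head(1)` keeps whole rows, NaN included. The names `df`, `mixed`, and `whole` are only for illustration.

```python
import numpy as np
import pandas as pd

# Reduced example data (hypothetical, modeled on the frame above)
df = pd.DataFrame({
    'Car': ['A', 'A', 'B', 'B'],
    'Driver': ['Fred', 'Betsy', 'John', 'Frank'],
    'Time': [np.nan, 45, 47, 44],
})

# first() works column by column and skips nulls by default,
# so Fred's row picks up the next non-null Time in group A
mixed = df.groupby('Car', as_index=False).first()

# head(1) keeps the first whole row of each group, NaN included
whole = df.groupby('Car').head(1).reset_index(drop=True)

print(mixed)
print(whole)
```

If this is indeed what is happening, swapping `.first()` for `.head(1)` after the sort may give the full first row per car.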