Rows being combined when trying to show first instance of grouped-by data

Question

I am trying to show the first row by group (in this case, Car is the group). When I try to do this with the data below, however, my code shows 45 for the time of Fred (which is actually Betsy's time from the row below). I would like the output to show the first full row for Car A & the first full row for Car B even if they have np.nan in the time column.

Can someone help me understand what I'm doing wrong and why my code would be combining row information like this?

Thanks!

import numpy as np
import pandas as pd

test_df = pd.DataFrame({'Race':[1,1,1,2,2,2],'Car':['A','A','A','B','B','B'], 'Date':['5/1/2019','4/15/2019','3/1/2019','5/1/2019','2/1/2019','1/5/2019'],
                        'Driver':['Fred','Betsy','John','John','Frank','Frank'],'Time':[np.nan,45,46,47,44,43]})

test_df = test_df.sort_values(['Race', 'Car', 'Date'], ascending=[True, True, False]).groupby(['Car'], as_index=False).first()

This post has some more details: https://stackoverflow.com/questions/55583246/what-is-different-between-groupby-first-groupby-nth-groupby-head-when-as-index/55583395#55583395 — ALollz, May 02 '19 at 03:28

Ji Wei · Accepted Answer · 2019-05-02T02:43:05.960

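Use .head(1) instead of .first():

test_df = test_df.sort_values(['Race', 'Car', 'Date'], ascending=[True, True, False]).groupby(['Car'], as_index=False).head(1)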

Output:

   Race Car      Date Driver  Time
0     1   A  5/1/2019   Fred   NaN
3     2   B  5/1/2019   John  47.0

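The difference between the two is how NaN is treated: .first() returns the first non-null value in each column for each group, so it can combine values from different rows, while .head(1) returns the first row of each group unchanged.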
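To make the difference concrete, here is a minimal sketch on a smaller, made-up frame (the name df and its four rows are assumptions for illustration, not the asker's full data). It compares what .first() and .head(1) return when the first row of a group has NaN in one column.

```python
import numpy as np
import pandas as pd

# Hypothetical, trimmed-down version of the question's data
df = pd.DataFrame({
    'Car': ['A', 'A', 'B', 'B'],
    'Driver': ['Fred', 'Betsy', 'John', 'Frank'],
    'Time': [np.nan, 45, 47, 44],
})

grouped = df.groupby('Car', as_index=False)

# first() takes the first non-null value per column within each group,
# so Fred's missing Time is filled with the 45 from Betsy's row
combined = grouped.first()

# head(1) keeps the first row of each group as-is, NaN included,
# and preserves the original row index
first_rows = grouped.head(1)

print(combined)
print(first_rows)
```

In this sketch, combined pairs Fred with a Time of 45, which is the "combined row" effect from the question, while first_rows keeps Fred's row with NaN.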

edited May 02 '19 at 02:43

answered May 02 '19 at 02:37

Ji Wei

Thanks, Ji Wei! Sadly, I spent a lot of time trying to figure this out today and your response is great. – newcoder May 02 '19 at 03:06
Glad it helped, appreciate if you could 'accept' my answer. – Ji Wei May 02 '19 at 03:33

score 0 · Answer 2 · answered May 02 '19 at 02:37

Use nth(0, dropna=False) instead of first():

test_df = test_df.sort_values(['Race', 'Car', 'Date'], ascending=[True, True, False]).groupby(['Car'], as_index=False).nth(0, dropna=False)

Output:

   Race Car      Date Driver  Time
0     1   A  5/1/2019   Fred   NaN
3     2   B  5/1/2019   John  47.0

answered May 02 '19 at 02:37

vb_rises

Great solution, Vishal. Thanks for taking time to answer my question. I tried to upvote but can't given my limited history on the site. Regardless, I appreciate your help! Thanks. – newcoder May 02 '19 at 03:06
