Let's say I have a dataframe that looks like this:

    interview       longitude        latitude
1   A1                  34.2             90.2
2   A1                  54.2             23.5
3   A3                  32.1             21.5
4   A4                  54.3             93.1
5   A2                  45.1             29.5
6   A1                  NaN              NaN
7   A7                  NaN              NaN
8   A1                  NaN              NaN
9   A3                  23.1             38.2
10  A5                  -23.7            -98.4

I would like to be able to perform some sort of groupby method that outputs each subgroup along with their respective longitude and latitude. So, desired output for something like this would be:

    interview         longitude         latitude 
1   A1                  34.2             90.2
2   A1                  54.2             23.5
6   A1                  NaN              NaN
8   A1                  NaN              NaN

5   A2                  45.1             29.5

3   A3                  32.1             21.5
9   A3                  23.1             38.2

... and so on

So this would need to be done in a loop, as I am going to need to iterate through each row of each subgroup.

My goal is to find which interviewer (A1, A2, ...) traveled the longest distance. Essentially, I just need to be able to perform some calculations within each subgroup. How would I go about doing this grouping iteratively, so that I can then perform an operation within each subgroup?

Thanks!

sgerbhctim
  • Based on your output, this doesn't look like a groupby, this looks like a `df.sort_values(by=['interview','longitude','latitude'], ascending=False)`. For the use case you described, it makes more sense to add a column that calculates the distance (I'm assuming from/to some common lat long?) then do `df.groupby('interview').max()['distance']` – G. Anderson Jan 14 '19 at 21:51

1 Answer

You can loop over the different groups in a GroupBy:

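for name, group in df.groupby('interview'):
    # name is the interview label (e.g. 'A1'); group is a DataFrame of that label's rows
    # perform some operations on group, for example:
    print(name, len(group))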
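
As a rough sketch of how this loop could serve the "longest distance traveled" goal: the example below rebuilds the sample frame from the question and sums the straight-line distance between consecutive rows in each group. The helper name `path_length` is hypothetical, and two things are assumptions rather than anything stated in the question: that "distance traveled" means the path through a group's rows in their existing order, and that plain Euclidean distance in raw coordinate units is an acceptable stand-in. For real geographic distances you would likely swap in a great-circle formula.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "interview": ["A1", "A1", "A3", "A4", "A2", "A1", "A7", "A1", "A3", "A5"],
        "longitude": [34.2, 54.2, 32.1, 54.3, 45.1, np.nan, np.nan, np.nan, 23.1, -23.7],
        "latitude": [90.2, 23.5, 21.5, 93.1, 29.5, np.nan, np.nan, np.nan, 38.2, -98.4],
    },
    index=range(1, 11),
)


def path_length(group):
    """Hypothetical helper: total straight-line distance between consecutive
    rows of one group, ignoring rows with missing coordinates."""
    pts = group[["longitude", "latitude"]].dropna().to_numpy()
    if len(pts) < 2:
        return 0.0  # zero or one usable point means no travel
    steps = np.diff(pts, axis=0)  # coordinate deltas between consecutive rows
    return float(np.hypot(steps[:, 0], steps[:, 1]).sum())


# Iterate over the groups exactly as in the loop above.
totals = {name: path_length(group) for name, group in df.groupby("interview")}

# Label of the group with the largest total distance.
farthest = max(totals, key=totals.get)
print(farthest, totals[farthest])
```

Groups with fewer than two rows that have coordinates (such as A7, which has only NaN values) get a distance of 0 under these assumptions.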
yatu