I'm not sure whether the end result still counts as cross-sectional data, since it covers a time period, but I think it does.
I have a data frame that looks like this:
Player          Finish  Tournament  Year  id
------------------------------------------------
Aaron Baddeley  9       Memorial    2012  1
Aaron Baddeley  17      Masters     2013  1
Aaron Watkins   15      US Open     2012  2
Adam Scott      9       US Open     2014  3
Adam Scott      4       Memorial    2014  3
Alex Cejka      8       US Open     2010  4
Andres Romero   2       Memorial    2012  5
Andrew Svoboda  19      Memorial    2014  6
Andy Sullivan   13      Memorial    2015  7
I want to reshape this data so that each player is a single observation (one row per player), with the desired output like this:
Player          2012_Memorial  2013_Memorial  2014_Memorial  ...  id
----------------------------------------------------------------------------
Aaron Baddeley  9              17             2012                1
Adam Scott      NA             NA             9                   3
.
.
.
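For reference, here is a minimal sketch of one way this long-to-wide reshape might be done with pandas' pivot_table. The helper column name "event" and the variable name "wide" are made up for illustration, and the rows are just the sample rows from the question, so treat it as a starting point rather than the definitive approach.

```python
import pandas as pd

# Sample rows copied from the data frame shown in the question.
df = pd.DataFrame({
    "Player": ["Aaron Baddeley", "Aaron Baddeley", "Aaron Watkins",
               "Adam Scott", "Adam Scott", "Alex Cejka",
               "Andres Romero", "Andrew Svoboda", "Andy Sullivan"],
    "Finish": [9, 17, 15, 9, 4, 8, 2, 19, 13],
    "Tournament": ["Memorial", "Masters", "US Open", "US Open", "Memorial",
                   "US Open", "Memorial", "Memorial", "Memorial"],
    "Year": [2012, 2013, 2012, 2014, 2014, 2010, 2012, 2014, 2015],
    "id": [1, 1, 2, 3, 3, 4, 5, 6, 7],
})

# Hypothetical helper column holding the wide label, e.g. "2012_Memorial".
df["event"] = df["Year"].astype(str) + "_" + df["Tournament"]

# One row per player, one column per event.
# Events a player did not play come out as NaN.
wide = df.pivot_table(
    index=["id", "Player"],
    columns="event",
    values="Finish",
    aggfunc="first",  # assumes at most one finish per player per event
).reset_index()
wide.columns.name = None  # drop the leftover "event" axis label

print(wide)
```

Using aggfunc="first" is an assumption that each player has at most one result per tournament and year; if duplicates are possible, a different aggregation would be needed.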
I've found the split-apply-combine paradigm, which looks promising. But even as a first step, I ran df.groupby('id'), and printing the result outputs this:
    Player          Finish  Tournament  Year
id
1   Aaron Baddeley  9       Memorial    2012
2   Aaron Watkins   15      US Open     2012
3   Adam Scott      9       US Open     2014
So it seems to have collapsed the groups, but have I now lost data? Or how is the object stored now? I realize I haven't done the apply stage yet, which is probably where I would generate the new rows and columns, but I don't know what the next step is, or whether there's a cookbook example for something like this.
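As a quick check on the "lost data" worry, here is a small sketch, using a few of the sample rows, that inspects what a groupby object holds. It assumes the one-row-per-id output above came from something like first(), which is a guess on my part; the variable names are made up for illustration.

```python
import pandas as pd

# A few of the sample rows from the question.
df = pd.DataFrame({
    "Player": ["Aaron Baddeley", "Aaron Baddeley", "Adam Scott", "Adam Scott"],
    "Finish": [9, 17, 9, 4],
    "Tournament": ["Memorial", "Masters", "US Open", "Memorial"],
    "Year": [2012, 2013, 2014, 2014],
    "id": [1, 1, 3, 3],
})

# groupby only records which rows belong to which id; nothing is dropped.
grouped = df.groupby("id")

sizes = grouped.size()            # number of rows in each group
first_rows = grouped.first()      # one row per id (first row of each group)
baddeley = grouped.get_group(1)   # every original row for id 1

# Iterating yields (key, sub-frame) pairs that together cover all rows.
total_rows = sum(len(sub) for _, sub in grouped)

print(sizes)
print(first_rows)
print(baddeley)
print(total_rows == len(df))
```

If that reading is right, the collapsed view is only a summary of each group, and the full rows are still available through get_group or by iterating over the grouped object.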
Thanks, Jared