2

The problem I am having is if I have the following dataframe that has been sorted by the Customer Number and Date columns, how do get the dates for each customer and corresponding salaries into new columns one by one.

    df = pd.DataFrame({'Customer Number':[1,1,1,2,2,3,3,3],
                'Date': [2013, 2014, 2015, 2016, 2017, 2013, 2014, 2015],
                'Salaries': [100, 200, 200, 300, 300, 4000, 5000, 5000]})

produces:

   Customer Number  Date  Salaries
0                1  2013       100
1                1  2014       200
2                1  2015       200
3                2  2016       300
4                2  2017       300
5                3  2013      4000
6                3  2014      5000
7                3  2015      5000

For this problem, the output I am trying to obtain is something like the following:

    Customer Number  Date 1  Salary 1  Date 2  Salary 2  Date 3  Salary 3
0                 1    2013       100    2014       200    2015       200
1                 2    2016       300    2017       300     NaN       NaN
2                 3    2013      4000    2014      5000    2015      5000

I have tried using pivot tables to arrange this data and numpy for reshaping the table but can't quite manage to get the new columns for dates and salaries in this way. I was hoping there would be a nice and simple solution to this but couldn't manage to find it. I'd really appreciate any help with this problem.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Mike P.
  • 109
  • 8
  • Try using the ```groupby``` function. Example : https://stackoverflow.com/a/27844045/5922920 – vbuzze May 24 '18 at 03:00

1 Answers1

1

cumcount

c = 'Customer Number'
df.set_index([c, df.groupby(c).cumcount() + 1]).unstack().sort_index(1, 1)

                   Date Salaries    Date Salaries    Date Salaries
                      1        1       2        2       3        3
Customer Number                                                   
1                2013.0    100.0  2014.0    200.0  2015.0    200.0
2                2016.0    300.0  2017.0    300.0     NaN      NaN
3                2013.0   4000.0  2014.0   5000.0  2015.0   5000.0
piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • Thank you! I'm extremely new to this and tried groupby before but guess I didn't understand its uses fully. – Mike P. May 24 '18 at 13:59