I'm currently working with pandas DataFrames in Python, and I need to create a specific DataFrame from another one.

The first DataFrame looks like this:

Index | Value
______|_______
0     | 1.1
0     | 0.3
1     | 1
2     | 0.2
2     | 3
2     | 1.3

I need to create another DataFrame using groupby() and cumsum(). I want the cumsum() result for each group to be a vector.

The result should look like this:

Index | Value
______|_______
0     | [1.1 , 1.4]
1     | [1]
2     | [0.2 , 3.2 , 4.5]

But I can't find a way to use groupby() and cumsum() to do this correctly.

Does anyone have a clue?

  • 2
    Does this answer your question? [How to group dataframe rows into list in pandas groupby?](https://stackoverflow.com/questions/22219004/how-to-group-dataframe-rows-into-list-in-pandas-groupby) – Shashank Rawat Jul 06 '20 at 13:30

2 Answers

1

Use a custom lambda function that converts the Series to a list for each group after cumsum:

df = df.groupby('Index')['Value'].apply(lambda x: x.cumsum().tolist()).reset_index()
print(df)
   Index                      Value
0      0  [1.1, 1.4000000000000001]
1      1                      [1.0]
2      2            [0.2, 3.2, 4.5]
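
The `1.4000000000000001` in the output is ordinary floating-point noise. As a minimal, self-contained sketch, the frame below is rebuilt from the question (assuming `Index` is a regular column, not the DataFrame index), and the running totals are rounded before being collected. The choice of 2 decimals is an assumption, not something the question specifies.

```python
import pandas as pd

# Sample data rebuilt from the question; 'Index' is assumed to be a normal column.
df = pd.DataFrame({
    "Index": [0, 0, 1, 2, 2, 2],
    "Value": [1.1, 0.3, 1, 0.2, 3, 1.3],
})

# Per group: running total, rounded (2 decimals is an assumption), then turned into a list.
result = (
    df.groupby("Index")["Value"]
      .apply(lambda s: s.cumsum().round(2).tolist())
      .reset_index()
)
print(result)
```

The result is kept in a new name (`result`) so that `df` still holds the original rows if you want to try the other approach on the same data.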

It is also possible to use a double groupby, though in my opinion it is a bit overcomplicated:

df = (df.assign(Value=df.groupby('Index')['Value'].cumsum())
        .groupby('Index')['Value']
        .apply(list)
        .reset_index())
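
The double groupby may be easier to follow when split into two named steps. This is only a sketch on the question's sample data (again assuming `Index` is a regular column); the names `running` and `result` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; 'Index' is assumed to be a normal column.
df = pd.DataFrame({
    "Index": [0, 0, 1, 2, 2, 2],
    "Value": [1.1, 0.3, 1, 0.2, 3, 1.3],
})

# Step 1: running total within each group, still aligned row by row with df.
running = df.groupby("Index")["Value"].cumsum()

# Step 2: group the running totals by the same key and collect each group into a list.
result = running.groupby(df["Index"]).agg(list).reset_index()
print(result)
```

Because the intermediate `running` Series has one value per original row, it can also be inspected or attached to `df` as a column before the lists are built.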
jezrael
0

Another method is to set the index first and use two consecutive groupby calls:

df_cumsum = df.set_index('Index').groupby(level=0).cumsum().groupby(level=0).agg(list)

print(df_cumsum)

                            Value
Index                            
0       [1.1, 1.4000000000000001]
1                           [1.0]
2                 [0.2, 3.2, 4.5]
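
All of the approaches above store Python lists in the cells. If "vector" in the question means a NumPy array, one hedged alternative is to skip the list-valued column and keep a plain dict of arrays keyed by group. This is a sketch of that idea, not something either answer proposes, and `vectors` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'Index' is assumed to be a normal column.
df = pd.DataFrame({
    "Index": [0, 0, 1, 2, 2, 2],
    "Value": [1.1, 0.3, 1, 0.2, 3, 1.3],
})

# Iterating a grouped Series yields (group key, sub-Series) pairs;
# each sub-Series is cumulatively summed and converted to a NumPy array.
vectors = {key: grp.cumsum().to_numpy() for key, grp in df.groupby("Index")["Value"]}

for key, vec in vectors.items():
    print(key, vec)
```

Arrays of different lengths sit more naturally in a dict than in a DataFrame column, and each value supports vectorized NumPy operations directly.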
Umar.H