I have this in pandas:

    a   b   c
0   A   1   6
1   A   2   5
2   B   3   4
3   B   4   3
4   B   5   2
5   C   6   1

and I want to transform it to this:

    a   b          c
0   A   [1, 2]     [6, 5]
1   B   [3, 4, 5]  [4, 3, 2]
2   C   [6]        [1]

What is the most efficient way to do this?

Outcast

1 Answer

Ok so it is:

df = df.groupby('a').agg({'b': list, 'c': list}).reset_index()

If there is anything better then you can let me know.
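
For anyone who wants to try it, here is a minimal self-contained sketch of the line above. It rebuilds the frame from the question; the variable name `result` is only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    'a': ['A', 'A', 'B', 'B', 'B', 'C'],
    'b': [1, 2, 3, 4, 5, 6],
    'c': [6, 5, 4, 3, 2, 1],
})

# Collect each group's values into a Python list, one row per key in 'a'.
# reset_index() turns the group key back into an ordinary column.
result = df.groupby('a').agg({'b': list, 'c': list}).reset_index()
print(result)
```

The dict form is handy when only some columns should become lists, since each column can get its own aggregation.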

Outcast
  • [`df.groupby('a').agg(list)`](https://stackoverflow.com/a/55839464/4909087) is shorter. You can fix the index as needed. – cs95 Jun 07 '19 at 15:30
  • You can write a function that returns a Series built from a dict. The dict holds your transformed columns (`b` and `c`). Each transformation must be some kind of aggregation over those columns. Then you just `apply` your function after a `groupby` on your dataframe. – Siddhant Tandon Jun 07 '19 at 15:42
  • `def f(x): return pd.Series(dict(b=x['b'].tolist(), c=x['c'].tolist()))`. Then just `df.groupby(..).apply(f)`. You can also use `mean` or `max` on the columns you want to transform. – Siddhant Tandon Jun 07 '19 at 15:47
  • @SiddhantTandon That is a lot more work than necessary. – cs95 Jun 07 '19 at 16:08
  • @cs95 Yeah, but this is a generalized approach that handles any kind of aggregation on your columns, not just converting to lists. It gives you the option to apply any transformation to your columns. – Siddhant Tandon Jun 07 '19 at 21:47
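
The two suggestions from the comments can be compared side by side. This is a rough sketch rather than a definitive recipe: the function name `f` follows the comment, while `short`, `general`, and the `c_max` column are made-up names for illustration. The `[['b', 'c']]` selection before `apply` is an assumption added here so the function only sees the non-key columns.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['A', 'A', 'B', 'B', 'B', 'C'],
    'b': [1, 2, 3, 4, 5, 6],
    'c': [6, 5, 4, 3, 2, 1],
})

# Shorter form: apply `list` to every non-key column at once.
short = df.groupby('a').agg(list).reset_index()

def f(group):
    # Generalized form: build one output row per group from a dict.
    # Each entry can be a different aggregation; 'c_max' is hypothetical.
    return pd.Series({
        'b': group['b'].tolist(),
        'c_max': group['c'].max(),
    })

# Select the value columns explicitly so `f` does not depend on
# whether the grouping column is passed to it.
general = df.groupby('a')[['b', 'c']].apply(f).reset_index()

print(short)
print(general)
```

When every column needs the same treatment, `agg(list)` is less code. The `apply` version only pays off when the per-group logic is more than a single named aggregation per column.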