I saw the question "grouping rows in list in pandas groupby",

but I have more than two columns, and I want to apply list to each of the non-grouping columns.

input:

df (a pd.DataFrame):

| index | c1 | c2 | c3 |
|-------|----|----|----|
|     1 | A  |  6 |  1 |
|     2 | A  |  5 |  2 |
|     3 | B  |  4 |  3 |
|     4 | B  |  3 |  4 |
|     5 | B  |  2 |  5 |
|     6 | C  |  1 |  6 |

expected output:

| c1 |    c2   |    c3   |
|----|---------|---------|
| A  | [6,5]   | [1,2]   |
| B  | [4,3,2] | [3,4,5] |
| C  | [1]     | [6]     |

I also tried

df.groupby('c1').apply(list)

but it gives the following result:

| c1 |             |
|----|-------------|
| A  | ['c2','c3'] |
| B  | ['c2','c3'] |
| C  | ['c2','c3'] |

How can I do this?

Thanks.

hrsma2i
  • It's not really good practice to store lists as DataFrame elements. You might want to think about solving your problem in a different way. – Ted Petrou Jun 02 '18 at 03:38

1 Answer

This is a well-known issue when using apply with list. Use agg instead:

df.groupby('c1').agg(lambda x : list(x))
Out[15]: 
           c2         c3
c1                      
A      [6, 5]     [1, 2]
B   [4, 3, 2]  [3, 4, 5]
C         [1]        [6]
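
For reference, here is a self-contained sketch of the same approach, assuming the sample frame from the question is rebuilt by hand (the names df and result are only illustrative). It also hints at why apply(list) most likely returned column names: each group handed to apply is a DataFrame, and calling list on a DataFrame iterates over its column labels, not its rows.

```python
import pandas as pd

# Rebuild the sample data from the question (index 1..6 as shown there).
df = pd.DataFrame(
    {
        "c1": ["A", "A", "B", "B", "B", "C"],
        "c2": [6, 5, 4, 3, 2, 1],
        "c3": [1, 2, 3, 4, 5, 6],
    },
    index=range(1, 7),
)

# agg applies the function to each non-grouping column of each group,
# so list(x) receives a Series and returns that column's values.
result = df.groupby("c1").agg(lambda x: list(x))
print(result)

# Calling list on a whole DataFrame yields its column labels instead,
# which is what happens to each group inside apply(list).
print(list(df))
```

If the grouped values are needed as ordinary rows again later, keeping the original long format and grouping on demand is usually easier than unpacking the lists.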
BENY
  • This is cool... – hrsma2i Jun 02 '18 at 03:44
  • this is great when there's just numbers, but in cases where the column has strings, the list(x) thing goes another level in and splits those strings into lists of a letter each. So 'apple','ball' become `a,p,p,l,e,b,a,l,l` instead of `['apple','ball']`. – Nikhil VJ Jul 19 '18 at 04:05