Transform same column row data with multiple columns data to new dataframe

Question

I have a LARGE pandas dataframe which looks like this:

"name" "price" "quantity"
"s1"   2       5
"s2"   3       7
"s3"   9       2
"s1"   5       10
"s2"   8       1
etc    etc     etc

I want to make a new dataframe which looks like this:

"name" "price" "quantity"
"s1"   [2, 5]  [5, 10]
"s2"   [3, 8]  [7, 1]
etc    etc     etc

Thanks in advance!

*Why* do you want to do this? You shouldn't store lists in a dataframe — user3483203, Dec 07 '18 at 20:16
What should I do then? I'm new to pandas and I'm trying to learn =) — Lindau, Dec 07 '18 at 20:27
I think the format you have it stored in now is perfectly reasonable. Much easier to access and query for sure. — user3483203, Dec 07 '18 at 20:28
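
To illustrate the comment above, here is a minimal sketch of querying the data in its original long format. It assumes the columns are named name, price and quantity without quote characters, and the variable names s1_rows and summary are made up for this example.

```python
import pandas as pd

# Sample data from the question, one row per observation (long format).
df = pd.DataFrame({
    "name": ["s1", "s2", "s3", "s1", "s2"],
    "price": [2, 3, 9, 5, 8],
    "quantity": [5, 7, 2, 10, 1],
})

# Select every row for one name with a plain boolean mask.
s1_rows = df[df["name"] == "s1"]

# Compute per-name statistics directly, without building lists first.
summary = df.groupby("name").agg(
    total_quantity=("quantity", "sum"),
    mean_price=("price", "mean"),
)
print(s1_rows)
print(summary)
```

Both operations would need extra unpacking steps if each cell held a list.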

Yuca · Accepted Answer · 2018-12-07T21:16:00.583

Use the DataFrame's groupby method with a lambda function for the aggregation:

df.groupby('"name"').agg(lambda x: tuple(x))

Output:

        "price" "quantity"
"name"                   
"s1"    (2, 5)    (5, 10)
"s2"    (3, 8)     (7, 1)
"s3"      (9,)       (2,)
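
The question asked for lists rather than tuples. A self-contained sketch of that variant follows, assuming the column names contain no quote characters. The sample frame is rebuilt from the question, and the variable name grouped is only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name": ["s1", "s2", "s3", "s1", "s2"],
    "price": [2, 3, 9, 5, 8],
    "quantity": [5, 7, 2, 10, 1],
})

# Collect each group's values into a Python list, one list per column.
grouped = df.groupby("name").agg(lambda s: s.tolist())

# Turn the "name" index back into a regular column, as in the desired output.
grouped = grouped.reset_index()
print(grouped)
```

Swapping s.tolist() for tuple(s) gives the tuple-based result shown above.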

edited Dec 07 '18 at 21:16

answered Dec 07 '18 at 20:03

Yuca

Works perfectly! Thank you so much for that quick answer! You don't know how much I have looked for this lol, thanks again! – Lindau Dec 07 '18 at 20:07
no problem, glad I could help – Yuca Dec 07 '18 at 20:07

`df.groupby('"name"').agg(tuple)` – user3483203 Dec 07 '18 at 20:17
I get a ValueError ("no results") when I try the df.groupby('"name"').agg(tuple) command. Am I doing something wrong? – Lindau Dec 07 '18 at 20:35
hmm double check that you're using the right column name, they usually don't have quotes in them (if your column name is *name* then use `df.groupby('name')`) – Yuca Dec 07 '18 at 20:42
I can't get it to work. Anyway, the old solution worked well, so I think I'll go with that one. Thanks and have a nice night! =) – Lindau Dec 07 '18 at 20:46
well that's useful feedback, on my end it works. Maybe it's a versioning issue. gn :) – Yuca Dec 07 '18 at 21:00
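
A small sketch of the column-name check suggested in the comments. It assumes, hypothetically, that the column labels literally contain double-quote characters, as in the answer's code. It also uses the lambda form that the asker reported working.

```python
import pandas as pd

# Hypothetical frame whose column labels include literal quote characters.
df = pd.DataFrame({
    '"name"': ["s1", "s2", "s1"],
    '"price"': [2, 3, 5],
})

# Inspect the actual labels before grouping.
print(list(df.columns))

# Strip the surrounding quotes so the columns can be referenced plainly.
df.columns = df.columns.str.strip('"')

# Group on the cleaned column name.
result = df.groupby("name").agg(lambda x: tuple(x))
print(result)
```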
