I have a LARGE pandas dataframe which looks like this:

"name" "price" "quantity"
"s1"   2       5
"s2"   3       7
"s3"   9       2
"s1"   5       10
"s2"   8       1
etc    etc     etc

I want to make a new dataframe which looks like this:

"name" "price" "quantity"
"s1"   [2, 5]  [5, 10]
"s2"   [3, 8]  [7, 1]
etc    etc     etc

Thanks in advance!

apaderno
Lindau

1 Answer

Use `DataFrame.groupby` with a lambda function for the aggregation:

df.groupby('"name"').agg(lambda x: tuple(x))

Output:

        "price" "quantity"
"name"                   
"s1"    (2, 5)    (5, 10)
"s2"    (3, 8)     (7, 1)
"s3"      (9,)       (2,)
Yuca
  • Works perfectly! Thank you so much for that quick answer! You don't know how much I have looked for this lol, thanks again! – Lindau Dec 07 '18 at 20:07
  • no problem, glad I could help – Yuca Dec 07 '18 at 20:07
  • `df.groupby('"name"').agg(tuple)` – user3483203 Dec 07 '18 at 20:17
  • I get a ValueError when I try the `df.groupby('"name"').agg(tuple)` command: "no results". Am I doing something wrong? – Lindau Dec 07 '18 at 20:35
  • Hmm, double-check that you're using the right column name; they usually don't have quotes in them (if your column name is *name*, then use `df.groupby('name')`). – Yuca Dec 07 '18 at 20:42
  • I can't get it to work. Anyway, the old solution worked well, so I think I'll go with that one. Thanks and have a nice night! =) – Lindau Dec 07 '18 at 20:46
  • well that's useful feedback, on my end it works. Maybe it's a versioning issue. gn :) – Yuca Dec 07 '18 at 21:00
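
On the column-name point raised in the comments: one way to check is to print the exact labels before grouping. The sketch below is only an illustration; the small frame is hypothetical and assumes the labels really do contain literal double quotes, which may not match the asker's data. It does not attempt to reproduce the ValueError discussed above.

```python
import pandas as pd

# Hypothetical frame whose column labels include literal double quotes
df = pd.DataFrame({
    '"name"': ["s1", "s2", "s1"],
    '"price"': [2, 3, 5],
})

# Inspect the exact labels; stray quotes or spaces show up here
print(df.columns.tolist())

# Strip the surrounding quotes so the plain label 'name' can be used
clean = df.rename(columns=lambda c: c.strip('"'))

# Same lambda-based aggregation as in the answer
grouped = clean.groupby("name").agg(lambda x: tuple(x))
print(grouped)
```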