I have a Pandas dataframe that looks like this:
+---+--------+-------------+------------------+
| | ItemID | Description | Feedback |
+---+--------+-------------+------------------+
| 0 | 8988 | Tall Chair | I hated it |
+---+--------+-------------+------------------+
| 1 | 8988 | Tall Chair | Best chair ever |
+---+--------+-------------+------------------+
| 2 | 6547 | Big Pillow | Soft and amazing |
+---+--------+-------------+------------------+
| 3 | 6547 | Big Pillow | Horrific color |
+---+--------+-------------+------------------+
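To reproduce the dataframe above, something like the following should work. The values are copied from the table; the variable name `df` and the default integer index are assumptions.

```python
import pandas as pd

# Sample data copied from the table above; `df` is an assumed variable name.
df = pd.DataFrame({
    "ItemID": [8988, 8988, 6547, 6547],
    "Description": ["Tall Chair", "Tall Chair", "Big Pillow", "Big Pillow"],
    "Feedback": ["I hated it", "Best chair ever", "Soft and amazing", "Horrific color"],
})

print(df)
```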
I want to concatenate the values from the "Feedback" column into a new column, separated by commas, for rows where the ItemID matches. Like so:
+---+--------+-------------+----------------------------------+
| | ItemID | Description | NewColumn |
+---+--------+-------------+----------------------------------+
| 0 | 8988 | Tall Chair | I hated it, Best chair ever |
+---+--------+-------------+----------------------------------+
| 1 | 6547 | Big Pillow | Soft and amazing, Horrific color |
+---+--------+-------------+----------------------------------+
I've tried several variations of pivot, merge, stack, etc. and am stuck.
I think NewColumn would end up holding an array (a list of the feedback strings) rather than a single string, but I'm fairly new to Python so I'm not certain.
Ultimately, I'm going to try to use this for text classification: given a new "Description", generate some "Feedback" labels (a multiclass problem).
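
A minimal sketch of one way this could be done, assuming that grouping on both ItemID and Description is acceptable (that is, each ItemID always has the same Description). It uses `groupby` with `agg`; the names `result` and `as_lists` and the column name `FeedbackList` are only illustrative. The first variant joins the feedback into a single comma-separated string, the second keeps it as a Python list, which may be more convenient as label input for classification.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ItemID": [8988, 8988, 6547, 6547],
    "Description": ["Tall Chair", "Tall Chair", "Big Pillow", "Big Pillow"],
    "Feedback": ["I hated it", "Best chair ever", "Soft and amazing", "Horrific color"],
})

# Variant 1: one comma-separated string per item.
# sort=False keeps groups in order of first appearance instead of sorting by key.
result = (
    df.groupby(["ItemID", "Description"], sort=False)["Feedback"]
      .agg(", ".join)
      .reset_index(name="NewColumn")
)

# Variant 2: keep the feedback as a list per item (assumed to be handier for labels).
as_lists = (
    df.groupby(["ItemID", "Description"], sort=False)["Feedback"]
      .agg(list)
      .reset_index(name="FeedbackList")
)

print(result)
print(as_lists)
```

Note that `", ".join` only works if every Feedback value is a string; missing values (NaN) would need to be dropped or converted first.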