I have a dataframe of this shape:

   anger  fear  joy  love  sadness  surprise  thankfulness  disgust  guilt
0    1.0   0.0  0.0   0.0      0.0       0.0           0.0      0.0    0.0
1    1.0   0.0  0.0   0.0      0.0       0.0           0.0      0.0    0.0
2    1.0   0.0  0.0   0.0      0.0       0.0           0.0      0.0    0.0
3    1.0   0.0  0.0   0.0      0.0       0.0           0.0      0.0    0.0
4    1.0   0.0  0.0   0.0      0.0       0.0           0.0      0.0    0.0

Notice that the emotion labels ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise', 'thankfulness', 'disgust', 'guilt'] are spread across separate columns (one-hot encoded). Instead of multiple columns, I want a single column (emotion) that stores the emotion label for each row, based on a predefined dictionary:

{0: 'anger',
 1: 'fear',
 2: 'joy',
 3: 'love',
 4: 'sadness',
 5: 'surprise',
 6: 'thankfulness',
 7: 'disgust',
 8: 'guilt'}

Example output (different dataframe):

   Emotion  Text
0  joy      During the period of falling in love, each tim...
1  fear     When I was involved in a traffic accident.
2  anger    When I was driving home after several days of...
3  sadness  When I lost the person who meant the most to me.
4  disgust  The time I knocked a deer down - the sight of ...

I'm looking for a more idiomatic Pandas way to do this.
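
To make the goal concrete, here is a minimal sketch of the transformation being asked for. It assumes every row contains exactly one 1.0 and that the columns appear in the same order as the dictionary keys; the sample rows and the names `id2label`, `emotion_id`, and `result` are made up for illustration.

```python
import pandas as pd

# The predefined dictionary from the question.
id2label = {0: 'anger', 1: 'fear', 2: 'joy', 3: 'love', 4: 'sadness',
            5: 'surprise', 6: 'thankfulness', 7: 'disgust', 8: 'guilt'}
labels = list(id2label.values())

# Hypothetical sample data: one 1.0 per row (an assumption, not the real dataset).
df = pd.DataFrame(
    [[1, 0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0, 1]],
    columns=labels,
    dtype=float,
)

# idxmax(axis=1) returns, for each row, the column name holding the row maximum,
# which is the "on" column when the row is strictly one-hot.
result = pd.DataFrame({'emotion': df[labels].idxmax(axis=1)})

# The integer code is the position of that column; mapping it through the
# dictionary gives back the same label.
result['emotion_id'] = df[labels].to_numpy().argmax(axis=1)
assert (result['emotion_id'].map(id2label) == result['emotion']).all()
```

Note that `idxmax` silently picks the first maximum, so a row with several 1s or with all 0s would still get a single label.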

Ali H. Kudeir
  • 3
    I'm guessing you would only want certain values to appear in the `emotion` column based on whether they were "on" or not in the One-Hot encoding? Can you update your DataFrame to be a bit smaller (less text) and more diverse (not just all anger 1), and add the expected output (what does `emotion` look like)? I'm sure this question is answerable in its current state, but it could be made significantly easier to answer with more clarity and more diverse sample data. – Henry Ecker Jun 14 '21 at 22:50
  • For example, instead of having 9 columns that use 1 and 0 to denote the emotion, I need to replace them with one column named emotion that stores the label of the column whose value is 1 – Ali H. Kudeir Jun 14 '21 at 22:55
  • 1
    Brute force, but can you just multiply each column by the desired code and sum each row? For example `df["emotion"] = 0*df["anger"] + df["fear"] + 2*df["joy"]` – Davis Jun 15 '21 at 00:30
  • 1
    Does this answer your question? [Reversing 'one-hot' encoding in Pandas](https://stackoverflow.com/questions/38334296/reversing-one-hot-encoding-in-pandas) – Davis Jun 15 '21 at 00:34
  • 1
    Also, if you just want a label and need to deal with multiple 1s on a single row, it is very simple to do by dotting your matrix with all powers of 2: https://stackoverflow.com/a/63197019/4333359 – ALollz Jun 15 '21 at 01:14
  • I ended up solving it with this answer: https://stackoverflow.com/a/63196805/9213600, thanks a lot @ALollz – Ali H. Kudeir Jun 16 '21 at 01:52
  • This specific answer was better since it helped to get the exact labels even if there were two or more labels for each instance, https://stackoverflow.com/a/44879458/9213600 – Ali H. Kudeir Jun 16 '21 at 02:00
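
The ideas raised in the comments can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked answers: the three-column frame and the names `weighted_codes`, `combo_codes`, and `joined_labels` are hypothetical, and the last row deliberately has two labels switched on.

```python
import numpy as np
import pandas as pd

labels = ['anger', 'fear', 'joy']  # shortened label set for illustration
df = pd.DataFrame(
    [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 1]],   # two labels "on" in this row
    columns=labels,
)
X = df[labels].to_numpy()

# Weighted sum (the "brute force" idea): 0*anger + 1*fear + 2*joy.
# Only meaningful when each row has exactly one 1; the last row
# yields 3, which is not a valid single-label code.
weighted_codes = X @ np.arange(len(labels))

# Powers of 2: each combination of active columns gets a unique integer,
# so rows with several 1s remain distinguishable.
combo_codes = X @ (2 ** np.arange(len(labels)))

# Exact labels per row, even with two or more labels on the same row.
joined_labels = [
    ', '.join(label for label, on in zip(labels, row) if on)
    for row in X.astype(bool)
]
df['emotion'] = joined_labels
```

The weighted sum is the cheapest option but breaks on multi-label rows, which is why the label-joining variant is the safer choice when a row may carry more than one emotion.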
