I have a dataset with two columns: one holds the emotion and the other holds the text (a sentence) that expresses that emotion. I want to split each sentence into words and create one new row per word, each carrying the emotion of the sentence it came from.

import pandas as pd

df = pd.DataFrame({
    'emotion': ['joy', 'fear', 'sadness'],
    'text': ['falling love', 'involved traffic accident', 'lost person']
})

# Expected result: one row per word, keeping the emotion of its sentence
df_result = pd.DataFrame({
    'emotion': ['joy', 'joy', 'fear', 'fear', 'fear', 'sadness', 'sadness'],
    'text': ['falling', 'love', 'involved', 'traffic', 'accident', 'lost', 'person']
})

What I tried:

save = pd.DataFrame(columns=['emotion', 'text'])
d = {}
for idx, row in df.iterrows():
    row_lst = (row['text']).split()
    for word in row_lst:
        word_lst = [word]
        d[row['emotion']] = word_lst
        print(d)
save.append(d)

I checked the linked duplicate question and it is not the same: that one asks how to split into columns, whereas this one asks how to split into rows. So this is not a duplicate.

Y4RD13

1 Answer

Please try:

df=df.assign(text=df['text'].str.split('\s')).explode('text')
wwnde
    `split('\s')` is almost always wrong because it creates empty strings if there is more than one space separator. Use either `split('\s+')` or simply `split()`. – DYZ Feb 01 '21 at 00:35
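
To illustrate the comment above, here is a minimal sketch that compares the two split patterns. The double space in the first sentence is an assumption added for the demonstration and is not in the original data. The names `naive` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'emotion': ['joy', 'fear', 'sadness'],
    # Double space in 'falling  love' is added here to show the difference.
    'text': ['falling  love', 'involved traffic accident', 'lost person'],
})

# Splitting on a single whitespace character yields an empty string
# between the two consecutive spaces.
naive = df.assign(text=df['text'].str.split(r'\s')).explode('text')

# Splitting with no pattern collapses runs of whitespace, so no empty
# strings appear. reset_index gives the new rows a clean 0..n-1 index.
result = (
    df.assign(text=df['text'].str.split())
      .explode('text')
      .reset_index(drop=True)
)

print(naive['text'].tolist())
print(result)
```

With `split()` every word gets its own row alongside the emotion of its sentence. Without `reset_index(drop=True)`, the exploded rows keep the index of the row they came from, which may or may not be what you want.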