How to split sentences in a dataframe by dot to build another dataframe but keeping sentence ID

Question

I have a dataframe with ID and TEXT fields. I want to create another dataframe by splitting the sentences in the TEXT field on the dot while keeping the original ID.

So the phrase "I love cats. I hate snakes" becomes two sentences in two rows of the new dataframe:

0 `I love cats`
0 `I hate snakes`

Original Dataframe:

ID                      TEXT
1    This is a msg. Another msg
2    The weather is hot, the water is cold. My hands are freezing

Transformed Dataframe:

ID     TEXT
1      This is a msg
1      Another msg
2      The weather is hot, the water is cold
2      My hands are freezing

The code to build the dataframe:

import pandas as pd
df = pd.DataFrame({'ID':[1,2], 'TEXT':['This is a msg. Another msg', 'The weather is hot, the water is cold. My hands are freezing']})

I am trying to use split, as in df['TEXT'].astype(str).split('.'), but I keep getting an error because a Series object has no split method.

score 1 · Accepted Answer · answered Sep 04 '22 at 18:05

You also need to set ID as the index beforehand so that the exploded rows keep their respective IDs.

df.set_index('ID', inplace=True)
split = df['TEXT'].str.split('.').explode()
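
A fuller sketch of the same approach, run on the sample dataframe from the question. The `.str.strip()` and `reset_index()` steps are additions not in the answer above; they assume the desired result has no leading spaces and has ID back as a regular column.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'TEXT': ['This is a msg. Another msg',
             'The weather is hot, the water is cold. My hands are freezing'],
})

# Index by ID so that each exploded row carries its ID along
sentences = (
    df.set_index('ID')['TEXT']
      .str.split('.')   # one list of pieces per original row
      .explode()        # one row per piece, with the ID index repeated
      .str.strip()      # assumption: remove the space left after each dot
)

# assumption: turn the ID index back into an ordinary column
result = sentences.reset_index()
print(result)
```

Note that a text ending in a dot would produce a trailing empty string after the split, which may need filtering out.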

answered Sep 04 '22 at 18:05

Nuri Taş

score 0 · Answer 2 · edited Sep 06 '22 at 14:37

Instead of df['TEXT'].astype(str).split('.')

try: df['TEXT'].str.split('.').explode()
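
A minimal sketch of why the original call fails and of one way to keep the ID alongside each piece. String methods on a Series are reached through the `.str` accessor, and `DataFrame.explode` repeats the other columns for every list element. The `assign` / `explode('TEXT')` variant is an alternative that does not come from either answer, and the `.str.strip()` step is an assumption about the desired output.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'TEXT': ['This is a msg. Another msg',
             'The weather is hot, the water is cold. My hands are freezing'],
})

# A Series has no .split method of its own, so this raises AttributeError
try:
    df['TEXT'].astype(str).split('.')
except AttributeError as exc:
    error_message = str(exc)

# Alternative: replace TEXT with lists, then explode that column of the
# DataFrame so the ID column is repeated for every piece
out = (
    df.assign(TEXT=df['TEXT'].str.split('.'))
      .explode('TEXT', ignore_index=True)
)
# assumption: remove the space left after each dot
out['TEXT'] = out['TEXT'].str.strip()
print(out)
```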

edited Sep 06 '22 at 14:37

Jeru Luke

answered Sep 04 '22 at 18:03

gtomer

Your suggestion works. Now I get a list [This is a msg, Another msg] and I need it in different rows, while keeping the ID. I am having a hard time doing it ;-( – datashout Sep 04 '22 at 18:10
Thanks @gtomer. I can't vote on your answer because I am new to Stack Overflow, but you helped a lot. Nuri Taş's answer solved my problem completely thanks to the set_index operation that I was missing. Huge thanks though. – datashout Sep 04 '22 at 18:33
