MY DATA (https://i.stack.imgur.com/7QNkp.png)

Suppose we have a table like this. I would like to create a new column named 'Number'. If the 'Sentence' column contains the word 'fud', the new column should hold just the number that follows the word 'fud' in 'Sentence'.

I wrote something, but I could not work out how to assign the value based on the number after the word 'fud'. Could you help me?

Here is what I tried:

import pandas as pd
import numpy as nd
excel_file =  'SUT.xlsx'
df = pd.read_excel(excel_file)
df['Number'] = ''
# Stuck here: I cannot work out what to assign on the right-hand side.
df.loc[df["Sentence"].str.contains("fud"), "Number"] = ...
print(df)

My expected result looks like this (https://i.stack.imgur.com/AQ1zF.png)

SPC

1 Answer

I think .str.extract is what you are looking for:

df['Number'] = df['Sentence'].str.extract(r'fud\s*(\d+)')
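
As a minimal sketch of how this behaves, assuming some made-up rows (the real table is only available as an image, so the sample sentences below are hypothetical): the capture group `(\d+)` keeps only the digits after 'fud', and rows without a match get NaN. Passing `expand=False` is an optional choice here so that a single capture group comes back as a Series.

```python
import pandas as pd

# Hypothetical sample data standing in for the table in the screenshot.
df = pd.DataFrame({
    "Sentence": [
        "fud 12 apples",
        "no match here",
        "the fud45 case",
        "fud without digits",
    ]
})

# fud      : the literal word
# \s*      : optional whitespace between the word and the number
# (\d+)    : capture group holding the digits that end up in 'Number'
df["Number"] = df["Sentence"].str.extract(r"fud\s*(\d+)", expand=False)

print(df)
```

Note that the extracted values are strings, not integers; `pd.to_numeric` can convert them if numeric values are needed.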

EDIT: You can pass flags=re.IGNORECASE to the method to ignore case (this needs import re):

import re
df['Number'] = df['Sentence'].str.extract(r'fud\s*(\d+)', flags=re.IGNORECASE)
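
A small sketch of the case-insensitive version, again on hypothetical rows rather than the real spreadsheet. It also converts the result to numbers with `pd.to_numeric`, which is an extra step not in the answer above.

```python
import re

import pandas as pd

# Hypothetical rows covering several spellings of 'fud'.
df = pd.DataFrame({
    "Sentence": ["Fud 7", "FUD  21 units", "fUD3", "food 9"]
})

# flags=re.IGNORECASE lets the literal 'fud' match Fud, FUD, fUD, and so on.
df["Number"] = df["Sentence"].str.extract(
    r"fud\s*(\d+)", flags=re.IGNORECASE, expand=False
)

# Extracted values are strings; convert them to numbers (unmatched rows stay NaN).
df["Number"] = pd.to_numeric(df["Number"])

print(df)
```

The last row is left as NaN because 'food' does not contain the letters 'fud' in sequence.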

You can find more documentation on the extract method here: https://pandas.pydata.org/docs/reference/api/pandas.Series.str.extract.html

nathan294
  • Yep, thank you. It works, but how can we also match 'Fud' with a capital 'F'? – SPC Apr 26 '23 at 13:22
  • Suppose that we have fud, FUD, Fud, etc. in the 'Sentence' column. – SPC Apr 26 '23 at 13:39
  • Is it possible to use a regex to catch every variant of 'fud' in the 'Sentence' column? This answer does not catch 'Fud', 'FUD', 'fUD', etc. – SPC Apr 26 '23 at 14:07
  • Just updated my answer to add the re.IGNORECASE flag – nathan294 Apr 26 '23 at 18:06