I'm trying to extract text from a column using a regex pattern in Python so that I can move it to another column, but some rows are missed. At the same time, I need to keep the unextracted strings as they are.
My code is:
import pandas as pd
df = pd.DataFrame({
    'col': ['item1 (30-10)', 'item2 (200-100)', 'item3 (100 FS)', 'item4 (100+)', 'item1 (1000-2000)']
})
pattern = r'(\d+(\,[0-9]+)?\-\d+(\,[a-zA-Z])?\d+)'
df['result'] = df['col'].str.extract(pattern)[0]
print(df)
My output is:
                 col     result
0      item1 (30-10)      30-10
1    item2 (200-100)    200-100
2     item3 (100 FS)        NaN
3       item4 (100+)        NaN
4  item1 (1000-2000)  1000-2000
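A quick check with the standard `re` module (a sketch, not part of the original code) suggests why rows 2 and 3 come back as NaN: the pattern requires a hyphen between two runs of digits, so `100 FS` and `100+` can never match. The `values` and `matched` names below are only for illustration.

```python
import re

# Same pattern as in the question.
pattern = r'(\d+(\,[0-9]+)?\-\d+(\,[a-zA-Z])?\d+)'

values = ['item1 (30-10)', 'item2 (200-100)', 'item3 (100 FS)',
          'item4 (100+)', 'item1 (1000-2000)']

# True where the pattern finds a digits-hyphen-digits range, False otherwise.
matched = [re.search(pattern, value) is not None for value in values]

for value, hit in zip(values, matched):
    print(f'{value!r}: {"match" if hit else "no match"}')
```

Note that the pattern also needs at least two digits after the hyphen, so a value such as `5-3` would be skipped as well.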
My output should be:
col result newcolumn
0 item1 (30-10)
1 item2 (200-100)
2 item3 (100 FS)
3 item4 (100+)
4 item1 (1000-2000)
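One possible way to get there, as a sketch under assumptions: `result` keeps the extracted range exactly as in the code above, and `newcolumn` (whose intended contents are a guess here) holds the original string with the parenthesised range removed, so rows where nothing is extracted stay as they are.

```python
import pandas as pd

df = pd.DataFrame({
    'col': ['item1 (30-10)', 'item2 (200-100)', 'item3 (100 FS)',
            'item4 (100+)', 'item1 (1000-2000)']
})

# Same pattern as in the question.
pattern = r'(\d+(\,[0-9]+)?\-\d+(\,[a-zA-Z])?\d+)'

# Group 0 of the extract is the full range; rows without a range get NaN.
df['result'] = df['col'].str.extract(pattern)[0]

# Assumption: 'newcolumn' is the original text minus the "(range)" part.
# Rows where the pattern does not match are left untouched.
df['newcolumn'] = df['col'].str.replace(
    r'\s*\(' + pattern + r'\)', '', regex=True
)

print(df)
```

If the goal is instead to capture everything inside the parentheses, including `100 FS` and `100+`, a broader pattern such as `r'\(([^)]*)\)'` could be tried in the extract step.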