My dataframe has a column (spot_categories_name) like the one below. My goal is to get rid of the 'name' at the beginning and the closing characters ('}]) at the end. Briefly, I want to end up with the following:
Craftsman
BBQ
Theatre
Coffee Shop
...
Use .str.extract():
df['spot_categories_name'] = df['spot_categories_name'].str.extract(r'\'name\': \'([^\']*)\'')
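To try the extraction on its own, here is a minimal self-contained sketch. The sample rows are made up to mirror the format described in the question, and passing expand=False is my own addition so that the result comes back as a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample data in the same shape as the question's column.
df = pd.DataFrame(
    {"spot_categories_name": ["'name': 'Craftsman'}]", "'name': 'Coffee Shop'}]"]}
)

# Capture whatever sits between the quotes that follow 'name': .
# expand=False returns a Series for a single capture group.
df["spot_categories_name"] = df["spot_categories_name"].str.extract(
    r"'name': '([^']*)'", expand=False
)

print(df["spot_categories_name"].tolist())
# ['Craftsman', 'Coffee Shop']
```

Rows that do not match the pattern come back as NaN, so it may be worth checking for missing values afterwards if the column is not uniformly formatted.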
The pandas .str.split method splits each string into a list wherever it finds the given separator.
You can then use .str[n] to get the nth entry of each list. In your case you can split on ": '" and take the last entry, then split on "'}" and take the first entry, which seems to match your test cases. Here is an example:
import pandas as pd
test = pd.DataFrame(data=["'name': 'Craftsman'}]", "'name': 'BBQ'}]"], columns=['spot_categories_name'])
# Keep the text after ": '", then drop everything from "'}" onward, and assign the result back
test['spot_categories_name'] = test.spot_categories_name.str.split(": '").str[-1].str.split("'}").str[0]
print(test.to_dict())
#{'spot_categories_name': {0: 'Craftsman', 1: 'BBQ'}}