Editing Row Data in the Dataframe Pandas

Question

There is a column (spot_categories_name) in the dataframe like the one below. My goal is to remove the 'name' prefix at the beginning and the closing brackets (}]") at the end of each value. Briefly, I want the values to end up like the following:

Craftsman

BBQ

Theatre

Coffee Shop

...

`df['spot_categories_name'] = df['spot_categories_name'].map(lambda x: x.lstrip('\'name\': '))` see [here](https://stackoverflow.com/questions/13682044/remove-unwanted-parts-from-strings-in-a-column). Also, instead of pasting a picture, it would be helpful to see the dataframe pasted directly. — Life is Good, Jan 21 '21 at 18:06
It seems like this dataframe was generated inefficiently. You should try to generate a dataframe correctly in the first place. — Mitchell Olislagers, Jan 21 '21 at 18:08
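
A note on the lstrip suggestion in the first comment: str.lstrip treats its argument as a set of characters to remove, not as a literal prefix, so it can also remove the start of the value itself. The sketch below illustrates this in plain Python. The value 'market' and the helper strip_prefix are hypothetical and used only for illustration.

```python
# Characters passed to lstrip, as in the comment's suggestion.
chars = "'name': "

# Hypothetical row whose value starts with letters that also occur in "name".
sample = "'name': 'market'}]"

# lstrip keeps removing leading characters while they belong to the set,
# so the opening quote and the letters "m" and "a" of the value are lost.
stripped = sample.lstrip(chars)
print(stripped)  # rket'}]


def strip_prefix(value, prefix):
    """Remove prefix only when value actually starts with it (hypothetical helper)."""
    return value[len(prefix):] if value.startswith(prefix) else value


# Removing the literal prefix leaves the value intact.
safe = strip_prefix(sample, "'name': '")
print(safe)  # market'}]
```

For the rows shown in the question the lstrip call happens to work, because none of those values start with one of the stripped characters.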

score 3 · Answer 1 · answered Jan 21 '21 at 18:09

Use .str.extract():

df['spot_categories_name'] = df['spot_categories_name'].str.extract(r'\'name\': \'([^\']*)\'')
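
As a minimal sketch of how this could be tried out, the snippet below applies the same pattern to two made-up rows shaped like the ones in the question. The sample data is an assumption, and expand=False is an optional choice made here so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample rows mimicking the column described in the question.
df = pd.DataFrame(
    {"spot_categories_name": ["'name': 'Craftsman'}]", "'name': 'Coffee Shop'}]"]}
)

# Capture everything between the quotes that follow 'name': .
pattern = r"'name': '([^']*)'"

# expand=False returns a Series instead of a one-column DataFrame.
df["spot_categories_name"] = df["spot_categories_name"].str.extract(pattern, expand=False)

print(df["spot_categories_name"].tolist())
# ['Craftsman', 'Coffee Shop']
```

Rows that do not match the pattern come back as NaN, so it may be worth checking for missing values after the assignment.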

noah

Useful to see a regex solution. I actually think it is a bit more elegant than the way I proposed, if it does indeed match the test cases. – oli5679 Jan 21 '21 at 18:21
Ah, I missed "than" when I was reading quickly haha. – noah Jan 21 '21 at 19:07

score 1 · Accepted Answer · answered Jan 21 '21 at 18:07

If you use the pandas .str.split method, it splits each string into a list wherever it meets the given separator.

You can then use .str[n] to get the nth entry of these lists. In your case you can split on ": '" and take the last entry, then split on "'}" and take the first entry, and it seems to match your test cases. Here is an example below.

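import pandas as pd

# Two sample rows in the same shape as the question's column
test = pd.DataFrame(data=["'name': 'Craftman'}]", "'name': 'BBQ'}]"], columns=['spot_categories_name'])
# Keep the text after ": '" and before "'}", then assign it back to the column
test['spot_categories_name'] = test.spot_categories_name.str.split(": '").str[-1].str.split("'}").str[0]
print(test.to_dict())
# {'spot_categories_name': {0: 'Craftman', 1: 'BBQ'}}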
