How to remove different strings from several columns in pandas

Question

I have a pandas DataFrame that looks like this:

    keyword     | ranks         | search_type | search_volume
0   keyword1    | [{'rank': 1}] | 1           | {'search_volume': 10}
1   keyword1    | [{'rank': 1}] | 2           | {'search_volume': 10}
2   keyword2    | [{'rank': 1}] | 1           | {'search_volume': 390}
3   keyword2    | [{'rank': 1}] | 2           | {'search_volume': 390}
4   keyword3    | [{'rank': 1}] | 1           | {'search_volume': 170}
...

The ranks and search_volume columns should contain plain integers. I'm trying to find a way to remove [{'rank': and {'search_volume': along with the closing brackets, so that the table looks like this:

    keyword     | ranks | search_type | search_volume
0   keyword1    | 1     | 1           | 10
1   keyword1    | 1     | 2           | 10
2   keyword2    | 1     | 1           | 390
3   keyword2    | 1     | 2           | 390
4   keyword3    | 1     | 1           | 170
...

I've tried df['ranks'].replace('[{\'rank\':','',inplace=True), but it didn't do anything. It is also not the quickest way of solving this problem.

I've had a look at the thread "Pandas DataFrame: remove unwanted parts from strings in a column", but that solution handles one column at a time (it would be good to strip out all unwanted strings at once), and it raises this error: AttributeError: 'list' object has no attribute 'lstrip'.

I'm using Python 3.

Try: `df['ranks'].apply(lambda x: x[0]["rank"])` – Rakesh, May 29 '18 at 15:34

score 3 · Accepted Answer · answered May 29 '18 at 15:33

This is one way using pd.Series.apply:

df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank'])
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume'])

This assumes your ranks series contains lists, and your search_volume series contains dictionaries.
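A minimal sketch of the approach above, run against sample data reconstructed from the table in the question. The frame construction and the isinstance checks are illustrative additions, not part of the original answer. The checks confirm the stated assumption that the cells hold Python lists and dicts rather than strings. That is also a likely reason the string-based replace in the question had no effect: without regex=True, Series.replace looks for whole values equal to the given string, and no cell here is a string at all.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (illustrative).
df = pd.DataFrame({
    'keyword': ['keyword1', 'keyword1', 'keyword2', 'keyword2', 'keyword3'],
    'ranks': [[{'rank': 1}] for _ in range(5)],
    'search_type': [1, 2, 1, 2, 1],
    'search_volume': [{'search_volume': v} for v in (10, 10, 390, 390, 170)],
})

# Verify the assumption: ranks holds lists, search_volume holds dicts.
assert df['ranks'].map(lambda x: isinstance(x, list)).all()
assert df['search_volume'].map(lambda x: isinstance(x, dict)).all()

# ranks: take the first list element, then its 'rank' key.
df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank'])
# search_volume: the cell is already a dict, so index it directly.
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume'])

print(df)
```

If the isinstance checks fail because the cells are strings (for example after a round trip through CSV), this sketch does not apply as written and the values would need to be parsed first.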

zipa · Answer 2 · 2018-05-29T15:43:26.133

3

Use apply:

df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank'])
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume'])

BONUS

This one will work in your case, to make it a one-liner:

df[['ranks', 'search_volume']] = df[['ranks', 'search_volume']].applymap(lambda x: list((x[0] if isinstance(x, list) else x).values())[0])
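A longer-form sketch of the same idea, written for Python 3, where dict.values() returns a view that does not support indexing. The helper name first_value and the sample frame are hypothetical, introduced only for illustration. The sketch uses DataFrame.apply with Series.map rather than applymap, since applymap is deprecated in recent pandas releases in favour of DataFrame.map.

```python
import pandas as pd

def first_value(cell):
    """Return the first value of a dict, unwrapping a list around it first.

    Hypothetical helper: assumes each cell is either a dict or a
    non-empty list whose first element is a dict.
    """
    if isinstance(cell, list):
        cell = cell[0]
    return next(iter(cell.values()))

# Sample data reconstructed from the table in the question (illustrative).
df = pd.DataFrame({
    'keyword': ['keyword1', 'keyword1', 'keyword2', 'keyword2', 'keyword3'],
    'ranks': [[{'rank': 1}] for _ in range(5)],
    'search_type': [1, 2, 1, 2, 1],
    'search_volume': [{'search_volume': v} for v in (10, 10, 390, 390, 170)],
})

# Clean both columns in one statement, column by column.
cols = ['ranks', 'search_volume']
df[cols] = df[cols].apply(lambda s: s.map(first_value))

print(df)
```

Because the helper checks the cell type itself, the same call covers the list-wrapped ranks column and the plain-dict search_volume column.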

edited May 29 '18 at 15:43

answered May 29 '18 at 15:33
