4

I have a pandas DataFrame which looks like this:

    keyword     |    ranks    |search_type | search_volume
0   keyword1    |[{'rank': 1}]| 1          | {'search_volume': 10}
1   keyword1    |[{'rank': 1}]| 2          |{'search_volume': 10}
2   keyword2    |[{'rank': 1}]| 1          |{'search_volume': 390}
3   keyword2    |[{'rank': 1}]| 2          |{'search_volume': 390}
4   keyword3    |[{'rank': 1}]| 1          |{'search_volume': 170}
...

The ranks and search_volume columns should contain plain integers. I'm trying to find a way to remove [{'rank': and {'search_volume': along with the closing brackets, so that the table looks like:

    keyword     | ranks |search_type | search_volume
0   keyword1    |   1   |   1        |10
1   keyword1    |   1   |   2        |10
2   keyword2    |   1   |   1        |390
3   keyword2    |   1   |   2        |390
4   keyword3    |   1   |   1        |170
...

I've tried this: df['ranks'].replace('[{\'rank\':','',inplace=True), but it didn't do anything. It's also not the quickest way of solving this problem.

I've had a look at this thread, Pandas DataFrame: remove unwanted parts from strings in a column, but that solution handles one column at a time (it would be good to strip out all unwanted strings at once), and it returns this error: AttributeError: 'list' object has no attribute 'lstrip'.

I'm using Python 3.

jpp
jceg316

2 Answers

3

This is one way using pd.Series.apply:

df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank'])
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume'])

This assumes your ranks series contains lists, and your search_volume series contains dictionaries.
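As a minimal, self-contained sketch of the approach above: the snippet below rebuilds a few rows from the question (the values are copied from the question; storing them as real Python lists and dicts is an assumption), checks the cell types, and then extracts the integers.

```python
import pandas as pd

# Sample rows copied from the question. Assumption: the cells hold real
# Python lists/dicts, not their string representations.
df = pd.DataFrame({
    'keyword': ['keyword1', 'keyword1', 'keyword2'],
    'ranks': [[{'rank': 1}], [{'rank': 1}], [{'rank': 1}]],
    'search_type': [1, 2, 1],
    'search_volume': [{'search_volume': 10},
                      {'search_volume': 10},
                      {'search_volume': 390}],
})

# Confirm the assumption before extracting: list for ranks, dict for search_volume.
print(df['ranks'].map(type).unique())
print(df['search_volume'].map(type).unique())

# Pull the number out of each cell and make the dtype an explicit integer.
df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank']).astype(int)
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume']).astype(int)
print(df)
```

If the type check reports str instead (for example after a round trip through CSV), the cells would need parsing first, e.g. with ast.literal_eval, before the lambdas above can index into them.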

jpp
3

Use apply:

df['ranks'] = df['ranks'].apply(lambda x: x[0]['rank'])
df['search_volume'] = df['search_volume'].apply(lambda x: x['search_volume'])

BONUS

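This works in your case as a one-liner. In Python 3, dict.values() returns a view that cannot be indexed, so the first value is taken with next(iter(...)); the isinstance check handles both the list-wrapped ranks cells and the plain-dict search_volume cells:

df[['ranks', 'search_volume']] = df[['ranks', 'search_volume']].applymap(lambda x: next(iter((x[0] if isinstance(x, list) else x).values())))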
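For readability, the same idea can be written with a small named helper instead of a long lambda. This is only a sketch: first_value is a hypothetical name, the sample rows are copied from the question, and it assumes every cell is either a dict or a one-element list of dicts with a single key.

```python
import pandas as pd

def first_value(cell):
    """Return the single value from a dict, or from the first dict in a list.

    Hypothetical helper; assumes each cell has exactly one key/value pair.
    """
    if isinstance(cell, list):
        cell = cell[0]
    return next(iter(cell.values()))

# Sample rows copied from the question.
df = pd.DataFrame({
    'keyword': ['keyword1', 'keyword2'],
    'ranks': [[{'rank': 1}], [{'rank': 1}]],
    'search_type': [1, 1],
    'search_volume': [{'search_volume': 10}, {'search_volume': 390}],
})

cols = ['ranks', 'search_volume']
# Map the helper over each selected column; Series.map applies it cell by cell.
df[cols] = df[cols].apply(lambda s: s.map(first_value))
print(df)
```

Going through Series.map per column avoids DataFrame.applymap, which pandas 2.1 deprecated in favour of DataFrame.map.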
zipa