
I have a fairly large dataset (over 150M rows) and I want to replace some values in a column with the corresponding values from a dictionary.

What I have looks like this:

col1
John
John Marvin
Lucas
Name:Lucas
Mary
Mary Surname

And I am trying to make it look like this:

col1
John
John
Lucas
Lucas
Mary
Mary

The number of distinct values in col1 is not that large, so I thought of creating a dictionary that maps each odd value to its correct value.

d = {'John Marvin': 'John', 'Name:Lucas': 'Lucas', 'Mary Surname': 'Mary'}
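
For reference, here is a minimal sketch of how such a dictionary could be applied. It assumes the data sits in a pandas DataFrame named df (an assumption; nothing above says which library holds the data), and the six-row frame is only a stand-in for the real dataset. Whether this is fast enough for 150M rows has not been measured here.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset (which has over 150M rows).
df = pd.DataFrame(
    {"col1": ["John", "John Marvin", "Lucas", "Name:Lucas", "Mary", "Mary Surname"]}
)

# Maps each odd value to its cleaned form.
d = {"John Marvin": "John", "Name:Lucas": "Lucas", "Mary Surname": "Mary"}

# map() looks each value up in d and yields NaN where there is no entry;
# fillna() then puts the original value back for those rows.
df["col1"] = df["col1"].map(d).fillna(df["col1"])

print(df["col1"].tolist())
```

Values that are not keys in d pass through unchanged, so only the odd entries need to be listed in the dictionary.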

Given the length of my dataset, I am looking for a fast way to do this. Does anyone have an idea of a good way to do it?

Thanks!
