I have a data frame users with two columns, id and country:
id  country
1   France
2   United States
3   France
I want to add a new column, salary, which holds the average salary for the given country.
My first thought was to create a named config vector mapping each country to its salary, like this:
salary_country <- c(
  "France" = 45000,
  "United States" = 50000,
  ...)
And then to create the column like this (using dplyr):
tbl_df(users) %>%
  mutate(salary = ifelse(country %in% names(salary_country),
                         salary_country[country],
                         0))
It works like a charm. If the country does not exist in my salary_country vector, the salary is 0; otherwise it is the salary given for that country.
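To make the behaviour concrete, here is a minimal reproducible sketch. It applies the same ifelse expression directly, without the dplyr pipeline, so it runs in base R only. Some details are assumptions added for illustration: the fourth row with "Germany" is made up to show the fallback to 0, country is assumed to be a character column (not a factor), and the vector contains only the two countries listed above.

```r
# Lookup vector: names are countries, values are average salaries.
salary_country <- c("France" = 45000, "United States" = 50000)

# Sample data; row 4 ("Germany") is a hypothetical country that is
# absent from salary_country, added only to exercise the fallback.
users <- data.frame(
  id = 1:4,
  country = c("France", "United States", "France", "Germany"),
  stringsAsFactors = FALSE  # assumption: country is character, not factor
)

# Same logic as the mutate() call: look up by name, fall back to 0.
users$salary <- ifelse(users$country %in% names(salary_country),
                       salary_country[users$country],
                       0)

print(users$salary)
# 45000 50000 45000 0
```

Note that if country were a factor, salary_country[country] would index by the factor's integer codes rather than by name, so the character assumption matters here.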
However, it is quite slow on a very large data frame, and quite verbose.
Is there a better way to accomplish this?