I have a question about manipulating a data frame. Take this data frame as an example:
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
location <- c('New York', 'Alabama','New York')
employ.data <- data.frame(employee, salary, startdate, location)
employ.data
    employee salary  startdate location
1   John Doe  21000 2010-11-01 New York
2 Peter Gynn  23400 2008-03-25  Alabama
3 Jolie Hope  26800 2007-03-14 New York
Now I want to transform the location into numeric values. I know that I can do something like this:
transformlocation <- function(x) {
  x <- as.character(x)
  if (x == 'New York') {
    return('1')
  } else if (x == 'Alabama') {
    return('2')
  } else if (x == 'Florida') {
    return('3')
  } else {
    return('0')
  }
}
employ.data$location <- sapply(employ.data$location, transformlocation)
employ.data
    employee salary  startdate location
1   John Doe  21000 2010-11-01        1
2 Peter Gynn  23400 2008-03-25        2
3 Jolie Hope  26800 2007-03-14        1
But my final dataset has hundreds of distinct location values, so writing one branch per value is not practical. Is it possible to do this with something like a for-each loop instead?
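To make clearer what I am after, here is a rough sketch of the if/else chain rewritten as a lookup table. The names location.codes, code and auto.code are made up for this illustration, and 'Texas' is an invented value that stands in for a location missing from the table. I am assuming the codes can be plain numbers, and that when the particular numbers do not matter they could simply follow the order of first appearance.

```r
# Hypothetical lookup table: names are locations, values are their codes.
location.codes <- c('New York' = 1, 'Alabama' = 2, 'Florida' = 3)

# Example input; 'Texas' is deliberately absent from the table.
location <- c('New York', 'Alabama', 'New York', 'Texas')

# Index the table by name. as.character() guards against a factor column,
# which would otherwise be indexed by its integer codes.
code <- unname(location.codes[as.character(location)])

# Locations not in the table come back as NA; map them to 0,
# mirroring the final else branch of transformlocation().
code[is.na(code)] <- 0
code

# If the particular numbers do not matter, codes can be generated
# automatically in order of first appearance.
auto.code <- match(location, unique(location))
auto.code
```

With a table like this, adding a new location means adding one entry rather than one more branch, but I do not know whether this is the idiomatic way to do it.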
Thanks for your help!