I am trying to write a function that takes a data.frame and a list (or character vector) of its variable names, and creates new variables whose names are derived from the names in the list and whose values are computed from the corresponding variables.
For example, if data.frame d has variables x, y, z, w and the list of names is c('x', 'z'), the output might be vectors named x.cat and z.cat, with values based on the values of d$x and d$z.
I can do this with a loop:
df <- data.frame(x = c(1:10), y = c(11:20), z = c(21:30), w = c(41:50))
vnames <- c("x", "w")
loopfunc <- function(dat, vlst) {
  s <- paste(vlst, "cat", sep = ".")
  for (i in 1:length(vlst)) {
    dat[s[i]] <- NA
    dat[s[i]][dat[vlst[i]] %% 4 == 0] <- 0
    dat[s[i]][dat[vlst[i]] %% 4 == 1 | dat[vlst[i]] %% 4 == 3] <- 1
    dat[s[i]][dat[vlst[i]] %% 4 == 2] <- 2
  }
  dat[s]
}
dout <- loopfunc(df, vnames)
This outputs a 10x2 data.frame with columns x.cat and w.cat, whose values are 0, 1, or 2 depending on the remainder of the corresponding values of df$x and df$w mod 4 (remainder 0 gives 0, remainder 1 or 3 gives 1, remainder 2 gives 2).
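To make the rule concrete for a single column, here is a minimal sketch of the same mapping written as a vectorized lookup. The name cat_mod4 is made up for illustration, and the sketch assumes the column holds whole numbers (the loop above would give NA for non-integer values, whereas this lookup would not).

```r
# Hypothetical helper (not part of the original code): map the remainder
# mod 4 to a category using a lookup vector.
# Remainders 0, 1, 2, 3 map to categories 0, 1, 2, 1 respectively.
cat_mod4 <- function(v) {
  lookup <- c(0, 1, 2, 1)  # position k holds the category for remainder k - 1
  lookup[v %% 4 + 1]       # +1 because R vectors are indexed from 1
}

cat_mod4(1:10)
```

Missing values stay missing here, since indexing with NA returns NA.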
I would like to find a way to do something like this without a loop, maybe using the apply functions?
Here is a failed attempt:
noloopfunc <- function(dat, l){
  assign(l[2], NA)
  assign(l[2][d[l[1]] %% 4 == 0], 0)
  assign(l[2][d[l[1]] %% 4 == 2], 2)
  assign(l[2][(d[l[1]] %% 4 == 1) | (d[l[1]] %% 4 == 3)], 1)
  as.name(l[2])
}
newvnames <- sapply(vnames, function(x){paste(x, "cat", sep = ".")})
vpairs <- mapply(c, vnames, newvnames, SIMPLIFY = F)
lapply(vpairs, noloopfunc, d <- df)
Here the formal argument l is supposed to represent vpairs[[1]] or vpairs[[2]], both string vectors of length 2.
I found several threads on Stack Overflow about converting strings to variable names, but I couldn't find anything where this is done non-interactively, with the resulting variables then being referred to and assigned values later in the code.
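One direction that might avoid converting strings to variable names altogether is to build the new columns as a named list and only then turn it into a data.frame. The following is only a sketch under that assumption, not a tested replacement for the loop: noloop_sketch and the lookup vector are names invented here, and the columns are assumed to hold whole numbers.

```r
# Hypothetical loop-free variant: compute each new column with lapply,
# then attach the derived names, instead of using assign()/as.name().
noloop_sketch <- function(dat, vlst) {
  lookup <- c(0, 1, 2, 1)  # categories for remainders 0, 1, 2, 3
  # dat[vlst] is a data.frame, i.e. a list of columns, so lapply visits
  # one column at a time.
  out <- lapply(dat[vlst], function(v) lookup[v %% 4 + 1])
  names(out) <- paste(vlst, "cat", sep = ".")
  as.data.frame(out)
}

df <- data.frame(x = 1:10, y = 11:20, z = 21:30, w = 41:50)
dout2 <- noloop_sketch(df, c("x", "w"))
```

The point of the sketch is that the derived names are only ever used as names of list elements, so nothing has to be looked up or assigned by string inside the function.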
Thanks for any help.