I want to write a function that takes a data frame with a varying number of columns and adds two new columns:
- one based on a logical comparison across all the other columns, and
- one based on a logical comparison across all the other columns plus the first new column.
A minimal example would be a data set with two variables:
V1 <- c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0)
V2 <- c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0)
Data <- data.frame(V1, V2)
I want to create the two new columns with a function looking like this:
my.spec.df <- function(data, variables, new.var.name){
  new.df <- data
  # Note: Lag() is not base R; it is assumed to come from an add-on package such as Hmisc.
  # First new column
  # I want this logical comparison to apply to all variables listed in the variables argument,
  # not just V1 and V2, which are used here as a minimal example.
  new.df[[new.var.name]] <- 0
  new.df[[new.var.name]][new.df$V1 == Lag(new.df$V1, 1) & new.df$V2 == Lag(new.df$V2, 1)] <- 1
  # Second new column
  # I want this column to be named "Conj.Var." followed by the name of the first new variable.
  # I tried to achieve that with [[ ]], but it did not work (same in the next statement).
  new.df$Conj.Var.[[new.var.name]] <- 0
  # Again, I want the logical comparison to apply to all variables listed in the variables
  # argument and to the first newly created column.
  new.df$Conj.Var.[[new.var.name]][new.df$V1 == 1 & new.df$V2 == 1 & new.df[[new.var.name]] == 1] <- 1
  return(new.df)
}
spec.df <- my.spec.df(Data,
                      variables = c("V1", "V2"),
                      new.var.name = "NV1")
The new data frame should look like this:
print(spec.df)
   V1 V2 NV1 Conj.Var.NV1
1   1  0   0            0
2   0  1   0            0
3   1  1   0            0
4   1  1   1            1
5   0  0   0            0
6   0  1   0            0
7   1  0   0            0
8   1  0   1            0
9   0  0   0            0
10  0  1   0            0
11  0  1   1            0
12  1  1   0            0
13  1  0   0            0
14  0  1   0            0
15  0  0   0            0
As noted in the code comments, I am struggling with three things:
- applying the logical comparison for the first new column to all listed variables (not just the two in my minimal example), since the number of listed variables can range from one to many,
- building the name of the second new column from the name given to the first, and
- applying the logical comparison for the second new column to all listed variables as well.
Could anyone help? Many thanks in advance!
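One possible direction, offered as a minimal base-R sketch rather than a definitive answer. It assumes that Lag(x, 1) in the code above means "the value in the previous row" and swaps it for a plain vector shift, so no extra package is needed. The helper names same.as.prev, all.one and conj.name are made up for illustration. Reduce() combines one logical vector per listed variable with &, which works for one variable or many, and paste0() builds the second column name as a string so that [[ ]] can be used for it.

```r
# Sketch only: same signature as the function in the question, base R throughout.
my.spec.df <- function(data, variables, new.var.name) {
  new.df <- data

  # First new column: 1 where every listed variable equals its value in the
  # previous row. The first row has no previous row, so it is set to FALSE.
  # (Assumption: this is what Lag(x, 1) was meant to express.)
  same.as.prev <- Reduce(`&`, lapply(variables, function(v) {
    x <- new.df[[v]]
    c(FALSE, x[-1] == x[-length(x)])
  }))
  # %in% TRUE turns any NA comparison result into FALSE
  new.df[[new.var.name]] <- as.numeric(same.as.prev %in% TRUE)

  # Second new column: 1 where every listed variable is 1 and the first
  # new column is 1 as well.
  all.one <- Reduce(`&`, lapply(variables, function(v) new.df[[v]] == 1))
  conj.name <- paste0("Conj.Var.", new.var.name)  # e.g. "Conj.Var.NV1"
  new.df[[conj.name]] <- as.numeric((all.one & new.df[[new.var.name]] == 1) %in% TRUE)

  new.df
}

# Example data from the question
V1 <- c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0)
V2 <- c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0)
Data <- data.frame(V1, V2)

spec.df <- my.spec.df(Data,
                      variables = c("V1", "V2"),
                      new.var.name = "NV1")
print(spec.df)
```

For the example data, this gives NV1 equal to 1 in rows 4, 8 and 11, and Conj.Var.NV1 equal to 1 in row 4 only, matching the expected output. The sketch does not handle a data frame with zero rows.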