I have the following data frame:
varnames <- c("ID", "a.1", "b.1", "c.1", "a.2", "b.2", "c.2")
a <- matrix(c(1, 2, 3, 4, 5, 6, 7), 2, 7)  # the 7 values are recycled to fill the 2 x 7 matrix
colnames(a) <- varnames
df <- as.data.frame(a)
df
  ID a.1 b.1 c.1 a.2 b.2 c.2
1  1   3   5   7   2   4   6
2  2   4   6   1   3   5   7
I would like to categorize the "a.2", "b.2", and "c.2" columns using the quartiles of "a.1", "b.1", and "c.1", respectively:
cat.a.2 <- cut(df$a.2, c(-Inf, quantile(df$a.1), Inf))  # categorize a.2 using the quartiles of a.1
cat.a.2
[1] (-Inf,3] (-Inf,3]
Levels: (-Inf,3] (3,3.25] (3.25,3.5] (3.5,3.75] (3.75,4] (4, Inf]
cat.b.2 <- cut(df$b.2, c(-Inf, quantile(df$b.1), Inf))  # categorize b.2 using the quartiles of b.1
cat.b.2
[1] (-Inf,5] (-Inf,5]
Levels: (-Inf,5] (5,5.25] (5.25,5.5] (5.5,5.75] (5.75,6] (6, Inf]
cat.c.2 <- cut(df$c.2, c(-Inf, quantile(df$c.1), Inf))  # categorize c.2 using the quartiles of c.1
cat.c.2
[1] (5.5,7] (5.5,7]
Levels: (-Inf,1] (1,2.5] (2.5,4] (4,5.5] (5.5,7] (7, Inf]
Is there any way to do this task automatically?
I naively experimented with sapply():
quant.vars <- c("a.1", "b.1", "c.1")  # names of the variables whose quartiles I am going to use
vars <- c("a.2", "b.2", "c.2")  # names of the variables I am going to categorize
sapply(vars, FUN = function(x) {cut(df[, x], quantile(df[, quant.vars], na.rm = TRUE))})
     a.2        b.2          c.2
[1,] "(1,3.25]" "(3.25,4.5]" "(5.75,7]"
[2,] "(1,3.25]" "(4.5,5.75]" "(5.75,7]"
Of course, it is not the result I wanted.
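A likely reason for this result is that quantile() receives all three reference columns at once, so the breaks come from the pooled values of a.1, b.1, and c.1 rather than from each column separately. The sketch below rebuilds the same data and compares the pooled quartiles with the per-column ones; `pooled` and `per.col` are names introduced only for this illustration.

```r
# Rebuild the example data (the 7 values are recycled into a 2 x 7 matrix)
df <- as.data.frame(matrix(c(1, 2, 3, 4, 5, 6, 7), 2, 7))
names(df) <- c("ID", "a.1", "b.1", "c.1", "a.2", "b.2", "c.2")
quant.vars <- c("a.1", "b.1", "c.1")

# Quartiles of all reference values pooled into one vector
pooled <- quantile(unlist(df[, quant.vars]))

# Quartiles computed separately for each reference column (a 5 x 3 matrix)
per.col <- sapply(df[, quant.vars], quantile)
```

The pooled breaks are 1, 3.25, 4.5, 5.75, 7, which match the intervals printed above, while the per-column breaks for a.1 are 3, 3.25, 3.5, 3.75, 4.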
Moreover, when I add "Inf" to the cut() breaks, I get the following error:
sapply (vars,FUN=function (x){cut (df [,x], c(quantile (df[,quant.vars], Inf), na.rm=T))})
Error in quantile.default(df[, quant.vars], Inf) : 'probs' outside [0,1]
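The error seems to come from the position of the parentheses: Inf is passed as the second argument of quantile(), which is `probs`, and na.rm ends up inside c() instead of inside quantile(). A minimal sketch for a single pair, with the infinite bounds placed outside the quantile() call, might look like this (`breaks.a` and `cat.a` are illustrative names, not part of the original code):

```r
# Rebuild the example data
df <- as.data.frame(matrix(c(1, 2, 3, 4, 5, 6, 7), 2, 7))
names(df) <- c("ID", "a.1", "b.1", "c.1", "a.2", "b.2", "c.2")

# na.rm belongs to quantile(); -Inf and Inf are added around its result
breaks.a <- c(-Inf, quantile(df$a.1, na.rm = TRUE), Inf)

# Categorize a.2 with the breaks derived from a.1
cat.a <- cut(df$a.2, breaks.a)
```

With the example data this gives the same categories as the manual cat.a.2 call earlier in the post.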
In summary, my question is how to make R:
1. Calculate the quantiles of the variables with suffix 1 (a.1, b.1, c.1).
2. Recognize pairs of variables with a common prefix (a.1 and a.2, b.1 and b.2, c.1 and c.2).
3. In each pair, categorize the variable with suffix 2 using the quantiles obtained from the variable with suffix 1 (a.2 by the a.1 quantiles, b.2 by the b.1 quantiles, c.2 by the c.1 quantiles).
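The three steps listed above could be sketched roughly as follows. This is only one possible approach: it assumes that pairs can be identified purely by the ".1" and ".2" suffixes, and the helper `categorize_pair` and the result list `cats` are hypothetical names chosen for this illustration.

```r
# Rebuild the example data
df <- as.data.frame(matrix(c(1, 2, 3, 4, 5, 6, 7), 2, 7))
names(df) <- c("ID", "a.1", "b.1", "c.1", "a.2", "b.2", "c.2")

# Step 2: prefixes that have both a ".1" and a ".2" column
prefixes <- sub("\\.1$", "", grep("\\.1$", names(df), value = TRUE))
prefixes <- prefixes[paste0(prefixes, ".2") %in% names(df)]

# Steps 1 and 3: quartiles of <prefix>.1 become the breaks for <prefix>.2
categorize_pair <- function(prefix) {
  reference <- df[[paste0(prefix, ".1")]]
  target <- df[[paste0(prefix, ".2")]]
  breaks <- c(-Inf, quantile(reference, na.rm = TRUE), Inf)
  cut(target, breaks)
}

# One factor per pair, named like the manual results (cat.a.2, cat.b.2, ...)
cats <- lapply(prefixes, categorize_pair)
names(cats) <- paste0("cat.", prefixes, ".2")
```

The result is a named list of factors, which can be turned into a data frame with as.data.frame(cats) if needed. Note that cut() stops with an error when the breaks are not unique, so reference columns with many tied values may need extra handling.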
Thank you very much