Selecting the highest F value from a looped anova in R

Question

As part of a project I need to run an ANOVA between various columns of a CSV file. Is there a way to write a loop that runs the ANOVA between all the columns instead of doing each one individually?
Right now I am using the following code:

anova(colx,col1)
anova(colx,col2)
.
.
.
anova(colx,coln)

I want to automate this process and select the columns which give the maximum F value.

one approach would combine `combn()`, `lapply()`, `anova()`, some extraction via `[[` and then searching for the max statistic...without sample data, that's as far as I'm going to go — Chase, Sep 21 '14 at 01:12
What version of `anova()` accepts column names like that? Try to make an actual [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — MrFlick, Sep 21 '14 at 02:29
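
The approach outlined in the first comment might look roughly like the sketch below. Since the question gives no sample data, `mtcars` is assumed as a stand-in, and all columns are treated as numeric. `vapply()` is used in place of `lapply()` so the F values come back as a plain numeric vector; the names `col_pairs`, `fvals` and `best` are only illustrative.

```r
# Assumption: mtcars stands in for the asker's CSV data (all columns numeric).
ddf <- mtcars

# Every unordered pair of column names, one pair per list element.
col_pairs <- combn(names(ddf), 2, simplify = FALSE)

# Fit a one-predictor model for each pair and extract its F statistic.
fvals <- vapply(col_pairs, function(p) {
  fit <- aov(reformulate(p[2], response = p[1]), data = ddf)
  anova(fit)[["F value"]][1]
}, numeric(1))

# Pair of columns with the largest F value.
best <- col_pairs[[which.max(fvals)]]
cat("Max F value =", max(fvals), "for columns", best[1], "and", best[2], "\n")
```

With two numeric columns the F statistic is the same whichever one is the response, so unordered pairs from `combn()` should be enough here. If some columns are factors, that symmetry no longer holds and both orderings would need checking.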

rnso · Answer 1 · 2014-09-21T06:44:26.693


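If `ddf` is the data frame holding all your columns (`mtcars` is used as an example here), the loop below tries every ordered pair of distinct columns and keeps the pair with the largest F value: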

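ddf <- mtcars
maxfval <- 0; a <- 1; b <- 1
len <- length(ddf)    # number of columns
# Try every ordered pair of distinct columns (i = response, j = predictor)
for (i in seq_len(len)) for (j in seq_len(len)) {
    if (i != j) {
        # F statistic of the model column i ~ column j
        fval <- anova(aov(ddf[, i] ~ ddf[, j]))[["F value"]][1]
        # Keep the largest F value seen so far and the pair that produced it
        if (fval > maxfval) { maxfval <- fval; a <- i; b <- j }
    }
}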

cat('\nMax F value=',maxfval, '\nWith columns=',a,',',b,'\n')

Output:

Max F value= 130.9989 
With columns= 3 , 2
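
The question fixes one column (`colx`) and compares it with each of the others, whereas the loop above searches all pairs. A minimal sketch of the fixed-column case follows, assuming `mtcars` as the data and `mpg` in the role of `colx`; the names `response`, `predictors`, `ranked` and `best_col` are hypothetical.

```r
# Assumptions: mtcars stands in for the CSV data, "mpg" plays the role of colx.
ddf <- mtcars
response <- "mpg"
predictors <- setdiff(names(ddf), response)

# One F statistic per predictor column, named after that column.
fvals <- vapply(predictors, function(p) {
  fit <- aov(reformulate(p, response = response), data = ddf)
  anova(fit)[["F value"]][1]
}, numeric(1))

# Predictors ranked from highest to lowest F value.
ranked <- sort(fvals, decreasing = TRUE)
best_col <- names(ranked)[1]
print(head(ranked, 3))
cat("Column with the highest F value against", response, ":", best_col, "\n")
```

Keeping the whole ranked vector rather than only the maximum makes it easy to select the top few columns instead of just one.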

edited Sep 21 '14 at 06:44

answered Sep 21 '14 at 02:44

rnso
