I have a dataset containing several variables and I wish to test for differences between groups (Kruskal-Wallis test) for each variable separately.
My data (df) looks like this (carbon and nitrogen content for different agricultural managements, see name). I have 16 groups; to simplify, the extract below shows only 8 of them:
extract of the data
name N_cont C_cont agriculture
C_ero 1,064 8,380 1
C_ero 0,961 8,086 1
C_ero 0,977 8,331 1
Ds_ero 1,767 17,443 2
Ds_ero 1,802 18,264 2
Ds_ero 2,083 20,112 2
Ms_ero 1,547 14,380 3
Ms_ero 1,566 15,313 3
Ms_ero 1,505 14,760 3
Md_ero 1,512 14,303 4
Md_ero 1,656 15,331 4
Md_ero 1,500 13,788 4
C_upsl 1,121 10,581 5
C_upsl 1,159 10,460 5
C_upsl 1,223 10,171 5
Ds_upsl 1,962 20,656 6
Ds_upsl 1,784 16,780 6
Ds_upsl 1,720 17,482 6
Ms_upsl 1,578 16,228 7
Ms_upsl 1,634 15,331 7
Ms_upsl 1,394 13,419 7
Md_upsl 1,286 11,824 8
Md_upsl 1,241 11,452 8
Md_upsl 1,317 11,932 8
I have already converted agriculture to a factor:
df$agriculture<-factor(df$agriculture)
I can run statistical tests comparing all of the 16 groups,
e.g. kruskal.test(df$C_cont, df$agriculture)
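To make the "each variable separately" part concrete, here is a minimal sketch of looping the same test over both measurement columns. It assumes the extract above is read with decimal commas (dec = ","); the names vars and results are only illustrative.

```r
# Extract of the data from the question, read with decimal commas
df <- read.table(text = "
name N_cont C_cont agriculture
C_ero 1,064 8,380 1
C_ero 0,961 8,086 1
C_ero 0,977 8,331 1
Ds_ero 1,767 17,443 2
Ds_ero 1,802 18,264 2
Ds_ero 2,083 20,112 2
Ms_ero 1,547 14,380 3
Ms_ero 1,566 15,313 3
Ms_ero 1,505 14,760 3
Md_ero 1,512 14,303 4
Md_ero 1,656 15,331 4
Md_ero 1,500 13,788 4
C_upsl 1,121 10,581 5
C_upsl 1,159 10,460 5
C_upsl 1,223 10,171 5
Ds_upsl 1,962 20,656 6
Ds_upsl 1,784 16,780 6
Ds_upsl 1,720 17,482 6
Ms_upsl 1,578 16,228 7
Ms_upsl 1,634 15,331 7
Ms_upsl 1,394 13,419 7
Md_upsl 1,286 11,824 8
Md_upsl 1,241 11,452 8
Md_upsl 1,317 11,932 8
", header = TRUE, dec = ",", stringsAsFactors = FALSE)
df$agriculture <- factor(df$agriculture)

# Run one Kruskal-Wallis test per measurement column (illustrative names)
vars <- c("N_cont", "C_cont")
results <- lapply(vars, function(v) kruskal.test(df[[v]], df$agriculture))
names(results) <- vars

# Collect the p-values, one per variable
sapply(results, function(r) r$p.value)
```

Using df[[v]] with the full column name avoids relying on the partial matching that df$C does silently.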
But now I would like to run statistical tests only for specific groups out of the 8 groups, e.g. those which contain C (Conventional) or Ds (Direct seeding) in the name column, or those which contain ero (eroding site) or upsl (upper slope).
I tried grep and split, but neither worked, because the dimensions of x and y have to be the same.
Do you have any clue?
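One possible approach, as a minimal sketch: the length mismatch presumably comes from subsetting only one of the two vectors passed to kruskal.test (this is a guess, since the failing call is not shown). Subsetting the whole data frame with grepl keeps the values and the groups aligned. The regular expressions assume the name column always follows the management_site pattern seen in the extract, and sel, sub_df and ero_df are just illustrative names.

```r
# Extract of the data from the question, read with decimal commas
df <- read.table(text = "
name N_cont C_cont agriculture
C_ero 1,064 8,380 1
C_ero 0,961 8,086 1
C_ero 0,977 8,331 1
Ds_ero 1,767 17,443 2
Ds_ero 1,802 18,264 2
Ds_ero 2,083 20,112 2
Ms_ero 1,547 14,380 3
Ms_ero 1,566 15,313 3
Ms_ero 1,505 14,760 3
Md_ero 1,512 14,303 4
Md_ero 1,656 15,331 4
Md_ero 1,500 13,788 4
C_upsl 1,121 10,581 5
C_upsl 1,159 10,460 5
C_upsl 1,223 10,171 5
Ds_upsl 1,962 20,656 6
Ds_upsl 1,784 16,780 6
Ds_upsl 1,720 17,482 6
Ms_upsl 1,578 16,228 7
Ms_upsl 1,634 15,331 7
Ms_upsl 1,394 13,419 7
Md_upsl 1,286 11,824 8
Md_upsl 1,241 11,452 8
Md_upsl 1,317 11,932 8
", header = TRUE, dec = ",", stringsAsFactors = FALSE)
df$agriculture <- factor(df$agriculture)

# Logical index: names starting with "C_" (Conventional) or "Ds_" (Direct seeding)
sel <- grepl("^(C|Ds)_", df$name)

# Subset whole rows so values and groups stay the same length,
# then drop the factor levels that no longer occur
sub_df <- droplevels(df[sel, ])
kruskal.test(C_cont ~ agriculture, data = sub_df)

# Same idea for the eroding sites only (names ending in "_ero")
ero_df <- droplevels(df[grepl("_ero$", df$name), ])
kruskal.test(N_cont ~ agriculture, data = ero_df)
```

grepl returns a logical vector with one entry per row, which makes it convenient for row indexing; grep would return positions instead and could be used the same way inside df[ , ].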