I downloaded the R package RVAideMemoire in order to use the G.test.

    > head(bio)
      Date   Trt Treated Control Dead DeadinC AliveinC
    1 23Ap citol       1       3    1       0       13
    2 23Ap cital       1       5    3       1        6
    3 23Ap gerol       0       3    0       0        9
    4 23Ap   mix       0       5    0       0        8
    5 23Ap cital       0       5    1       0       13
    6 23Ap cella       0       5    0       1        4

So, I make subsets of the data to look at each treatment, because the G.test result will need to be pooled for each one.

    datamix<-subset(bio, Trt=="mix")
    head(datamix)
       Date Trt Treated Control Dead DeadinC AliveinC
    4  23Ap mix       0       5    0       0        8
    8  23Ap mix       0       5    1       0        8
    10 23Ap mix       0       2    3       0        5
    20 23Ap mix       0       0    0       0       18
    25 23Ap mix       0       2    1       0       15
    28 23Ap mix       0       1    0       0       12

For G.test(x) to work when x is a matrix, the matrix must have 2 numeric columns, with 1 row per population. If I use the apply() function, I can run G.test on each row, provided my data set contains only two columns of numbers. I want to look only at the Treated and Control columns, for example, but I'm not sure how to drop the other columns so that G.test ignores them and the headers. I've tried the following, but I get an error:

    apply(datamix, 1, G.test)
    Error in match.fun(FUN) : object 'G.test' not found

I have also thought about trying to use something like this rather than creating subsets.

    by(bio, Trt, rowG.test)

When you compare two numbers, G.test prints this:

    G-test for given probabilities
    data:  counts
    G = 0.6796, df = 1, p-value = 0.4097

My other question: once I'm able to get these numbers, is there some way to add up the df and G values from each row within each treatment? Is there also a way to have R report the G, df and p-values in a table, so they can be summed, rather than printing them row by row as above?

Any help is hugely appreciated.

user3605723
  • First, read this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. Then include your dataset, or at least a representative sample, in your question. Then show the code you tried and what happened. – jlhoward May 05 '14 at 20:42
  • Thank you for the tips! My apologies for the confusion. I hope it makes more sense now. – user3605723 May 06 '14 at 18:14

1 Answer

You're really close. This seems to work (hard to tell with such a small sample though).

    by(bio, bio$Trt, function(x) G.test(as.matrix(x[, 3:4])))

So first, the indices argument to by(...) (the second argument) is not evaluated in the context of bio, so you have to specify bio$Trt instead of just Trt.

Second, this will pass all the columns of bio, for each unique value of bio$Trt, to the function specified in the third argument. You need to extract only the two columns you want (columns 3 and 4).
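To see what each group actually hands to the function, here is a minimal sketch in base R. It rebuilds only the six rows shown in the question's head(bio) output, and `counts_only` is a hypothetical helper name introduced for illustration; selecting the columns by name is an alternative to the positional `3:4`.

```r
# Toy copy of the six rows from the question's head(bio) output
# (only the columns needed here).
bio <- data.frame(
  Date    = rep("23Ap", 6),
  Trt     = c("citol", "cital", "gerol", "mix", "cital", "cella"),
  Treated = c(1, 1, 0, 0, 0, 0),
  Control = c(3, 5, 3, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the two count columns, selected by name,
# and return them as a matrix.
counts_only <- function(x) as.matrix(x[, c("Treated", "Control")])

# by() calls the function once per unique value of bio$Trt.
# Note that bio$Trt has to be written out in full.
per_trt <- by(bio, bio$Trt, counts_only)

# The "cital" group has two rows in this toy sample, the others have one.
per_trt[["cital"]]
```

Swapping `counts_only` for a function that passes the matrix on to G.test would give the call shown above, assuming RVAideMemoire has been attached with library(RVAideMemoire).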

Third, and this is a bit subtle, passing x[,3:4] to G.test(...) causes it to fail with an unintelligible error. Looking at the code, G.test(...) requires a matrix as its first argument, whereas x[,3:4] in the code above is a data.frame. So you need to convert with as.matrix(...).
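The data.frame versus matrix distinction is easy to check in base R. This is a small sketch with made-up counts, not the real data:

```r
# Two made-up rows of counts, stored as a data.frame.
x <- data.frame(Treated = c(1, 0), Control = c(5, 5))

# Selecting several columns of a data.frame still gives a data.frame.
sub <- x[, 1:2]
is.matrix(sub)      # FALSE
is.data.frame(sub)  # TRUE

# as.matrix() converts it; all columns are numeric, so the matrix is numeric.
m <- as.matrix(sub)
is.matrix(m)        # TRUE
```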

jlhoward
  • I tried this, and I get NA for the G and p-value? I think the error is a result of the 0's in the matrix. The G.test won't work if x is not positive. Do you know of a way to fix this error? – user3605723 May 08 '14 at 13:41
  • You'll get `NA` if there is only one row in a treatment group, since under those conditions the G.test is meaningless. That is the case most of the time in your sample. – jlhoward May 08 '14 at 15:39
  • In the full data set there are about 12 rows per treatment. Even so, if I have a 1 in the Treated column and a 5 in the Control column, R does return a G and p-value, even though it's only one row in the treatment group. However, if I have 0 and 5, R returns an error. – user3605723 May 09 '14 at 12:57
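One possible source of trouble with zeros is that the G statistic contains terms of the form O * log(O / E), and in R `0 * log(0)` evaluates to NaN. Whether that is what happens inside G.test is an assumption, not something checked against the package source. The sketch below avoids the package and uses a hand-rolled helper, `g_stat` (a hypothetical name, not part of RVAideMemoire), which computes a goodness-of-fit G against equal expected proportions and treats a zero count as contributing 0. It also shows one way to collect G, df and p per row in a table and sum G and df within each treatment. The counts are made up.

```r
# In R, a zero count breaks the naive formula:
0 * log(0)  # NaN

# Hypothetical helper, NOT from RVAideMemoire: goodness-of-fit G statistic
# for a vector of counts against equal expected proportions.
g_stat <- function(counts) {
  expected <- rep(sum(counts) / length(counts), length(counts))
  # Convention assumed here: a zero count contributes 0 to the sum.
  terms <- ifelse(counts > 0, counts * log(counts / expected), 0)
  2 * sum(terms)
}

# Made-up counts, two rows per treatment.
bio <- data.frame(
  Trt     = c("cital", "cital", "mix", "mix"),
  Treated = c(1, 0, 0, 0),
  Control = c(5, 5, 5, 2)
)

# One G, df and p-value per row, stored as columns.
bio$G  <- apply(bio[, c("Treated", "Control")], 1, g_stat)
bio$df <- 1  # two categories per row, so 1 degree of freedom
bio$p  <- pchisq(bio$G, df = bio$df, lower.tail = FALSE)

# Sum G and df within each treatment, then get a p-value for the totals.
pooled <- aggregate(cbind(G, df) ~ Trt, data = bio, FUN = sum)
pooled$p <- pchisq(pooled$G, df = pooled$df, lower.tail = FALSE)
pooled
```

If G.test is used instead of `g_stat`, the same table can be built from the returned object's `statistic`, `parameter` and `p.value` elements, assuming it returns a standard "htest" object as the printed output in the question suggests.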