I am working with trees that we measure annually, and I am trying to run an Anova on growth and survival against tree species and planting stock type (bareroot, containerized, etc.). Some of these individuals experience 'dieback', so they have negative 'growth' between years, and some individuals die.

I keep getting an error message when I try to run an Anova on growth, but not on survival, which I have set up as binary.

The first image shows my data in Excel, saved as a CSV. I have put periods in the growth columns for trees that died and negative values for those that experienced dieback.

[Image: data in Excel]

To get my stuff into R I have been using

    data=read.csv(file.choose())  # then select the .csv file
    attach(data)

Then when I run

    htgrowth.aov=aov(HTGrowth~SPECIES*POT)

I receive the following error message.

[Image: error message from R]

However, if I run

    survival.aov=aov(SURVIVAL~SPECIES*POT)
    summary(survival.aov)

I get the expected Anova table that I need and am able to then run a TukeyHSD on it as well.

I am guessing that the missing data and negative numbers are the reason I cannot get it to run.

I have looked around for answers but I cannot figure out how to fix it. I do apologize if this has been addressed and I am just too dumb to find it.

TaylorHall
  • Do not post your data as an image; please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). – Jaap May 10 '16 at 22:14
  • Plain anova does not do well with missing values. It works in the second case because there are no missing values (I'd guess, without seeing your whole data). Please try looking at `lme` from package `nlme` for a linear mixed-effects model as an alternative that is robust to missing values and unbalanced designs. Here's a similar answer for a question involving a more complex anova example: https://stats.stackexchange.com/questions/37577/repeated-measures-anova-unbalanced-and-missing-values-in-r PS: in general there may be better advice on this topic at Cross Validated. – jaimedash May 10 '16 at 22:19
  • And your life will be much simpler if you unlearn the use of `attach` and learn to distrust the person or book that taught you that error-prone strategy. Furthermore, you posted only a _warning_, not an error message. Regression methods will accept data with missing values. – IRTFM May 10 '16 at 22:19
  • R does not treat `.` (period) as a missing value. The simplest way is to replace all the periods in your dataset with NA. Let me warn you that this is not the right way of doing things, but it will get you there if you are in a hurry. The better way, as suggested, is to use `read.table` with `na.strings = "."` if that is how your CSV file represents missing values (`.` is a very SAS way to represent missing values). – Gaurav Taneja May 10 '16 at 22:27
  • Thanks for all of the responses. I do apologize for the problems in the way I posted my question; I will certainly remember to do better next time I have an issue to ask about. @GauravTaneja I did use `read.table` with `na.strings="."` and it seems to work perfectly fine now. Thank you so much! – TaylorHall May 11 '16 at 14:34
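
For anyone landing here later, the fix confirmed in the last comment can be sketched roughly as follows. This is only a minimal sketch, not the asker's actual script: the inline CSV and all of its values are made up to stand in for the real file (only the column names `HTGrowth`, `SURVIVAL`, `SPECIES`, and `POT` come from the question), and it uses `read.csv` (a wrapper around `read.table`) with `na.strings = "."` and passes `data =` to `aov` instead of calling `attach`.

```r
# Hypothetical stand-in for the CSV in the question.
# "." marks trees that died; negative values are dieback.
csv_text <- "SPECIES,POT,HTGrowth,SURVIVAL
oak,bareroot,12.5,1
oak,bareroot,.,0
oak,bareroot,9.0,1
oak,container,-3.2,1
oak,container,7.1,1
oak,container,5.4,1
pine,bareroot,15.0,1
pine,bareroot,11.3,1
pine,bareroot,.,0
pine,container,8.8,1
pine,container,-1.5,1
pine,container,10.2,1"

# na.strings = "." tells R to read a lone period as NA,
# so HTGrowth is parsed as a numeric column rather than text.
trees <- read.csv(text = csv_text, na.strings = ".",
                  stringsAsFactors = TRUE)

str(trees$HTGrowth)          # numeric, with NA for the dead trees
sum(is.na(trees$HTGrowth))   # number of missing growth values

# Pass the data frame explicitly instead of using attach().
htgrowth.aov <- aov(HTGrowth ~ SPECIES * POT, data = trees)
summary(htgrowth.aov)
```

With the periods read as `NA`, `aov` drops the incomplete rows under R's default `na.action` (`na.omit`) and fits the model on the remaining trees. The negative dieback values need no special handling. For a real file, `text = csv_text` would be replaced by the file path or `file.choose()`.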

0 Answers