
I'm doing an assignment for university and have copied and pasted the R code, so I know it's right, but I'm still not getting any P or F values from my data:

Food  Temperature       Area
  50           11   820.2175
 100           11   936.5437
  50           14  1506.568
 100           14  1288.053
  50           17  1692.882
 100           17  1792.54

This is the code I've used so far:

aovdata<-read.table("Condition by area.csv",sep=",",header=T)
attach(aovdata)
Food <- as.factor(Food) ; Temperature <- as.factor(Temperature)
summary(aov(Area ~ Temperature*Food))

but then this is the output:

                  Df Sum Sq Mean Sq
Temperature       2 757105  378552
Food              1      1       1
Temperature:Food   2  35605   17803

                                                                                   

Any help, especially the code I need to fix it, would be great. I think there could be a problem with the data but I don't know what.

Katie
  • I don't think those continuous-looking variables should be factors. *Maybe* food, but certainly not temperature. – Gregor Thomas Dec 08 '20 at 14:53
  • Also, I'd strongly advise against using `attach`. It's almost universally maligned for causing more problems than it solves. R code that uses `attach` looks very dated; better methods have been popular for 10-15 years. – Gregor Thomas Dec 08 '20 at 14:54
  • Thanks for commenting quickly. I just use attach since I copy and paste the code as I said and my lecturer is pretty old and it's worked before but any advice of what to use instead would be great. Would you recommend using categorical variables instead? If so, I don't know how to replace those two since they were the key ones in the experiment we're analysing (how temperature and food levels influence fungal growth). (I'm also new and pretty clueless with R if you couldn't tell) – Katie Dec 08 '20 at 15:10
  • `factor` is the class R uses for categorical variables. Temperature is generally continuous, not categorical, and your data looks like you have continuous values. I'm not sure what `Food 50` and `Food 100` mean, so I can't tell you whether that should be continuous (`numeric`) or categorical (`factor`). – Gregor Thomas Dec 08 '20 at 15:26
  • As for `attach` alternatives: many useful functions have `data` arguments, so you don't need to `attach` the data. We have an old FAQ [Why is it not advisable to use attach](https://stackoverflow.com/q/10067680/903061) - it could probably use some updating. `dplyr` is hugely popular and is built in a way that makes `attach` pointless with it. Even here, `aov` has a `data` argument, so `summary(aov(Area ~ Temperature*Food, data = aovdata))` will work without `attach`, and will use the columns from your data frame. It's good to use the data frame so you can be more sure columns stay in sync. – Gregor Thomas Dec 08 '20 at 15:29

2 Answers


I would do this. Be aware of the difference between factor and continuous predictors.

library(tidyverse)

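# Turn the whitespace-separated rows into a single CSV string. Joining with
# "\n" makes read_csv() treat it as literal data rather than as file paths.
df <- sapply(strsplit(c("Food    Temperature Area", "50         11     820.2175", "100        11     936.5437", 
                  "50         14    1506.568", "100        14    1288.053", "50         17   1692.882", 
                  "100        17   1792.54")," +"), paste0, collapse=",") %>% 
  paste(collapse = "\n") %>% 
  read_csv()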

model <- lm(Area ~ Temperature * as.factor(Food),df)

summary(model)
#> 
#> Call:
#> lm(formula = Area ~ Temperature * as.factor(Food), data = df)
#> 
#> Residuals:
#>      1      2      3      4      5      6 
#> -83.34  25.50 166.68 -50.99 -83.34  25.50 
#> 
#> Coefficients:
#>                                Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)                    -696.328    505.683  -1.377    0.302  
#> Temperature                     145.444     35.580   4.088    0.055 .
#> as.factor(Food)100               38.049    715.144   0.053    0.962  
#> Temperature:as.factor(Food)100   -2.778     50.317  -0.055    0.961  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 151 on 2 degrees of freedom
#> Multiple R-squared:  0.9425, Adjusted R-squared:  0.8563 
#> F-statistic: 10.93 on 3 and 2 DF,  p-value: 0.08498

ggeffects::ggpredict(model,terms = c('Temperature','Food')) %>% plot()

Created on 2020-12-08 by the reprex package (v0.3.0)

Magnus Nordmo

The actual problem with your example is not that you're using factors as predictor variables, but rather that you have fitted a 'saturated' linear model (as many parameters as observations). There is no variation left over to compute a residual sum of squares, so the ANOVA table can't include F or P values.
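
A minimal sketch of how one might check this, assuming the six rows from the question are typed in directly rather than read from the CSV (the `aovdata` name is taken from the question; `fit_full` is just an illustrative name):

```r
# Data typed in from the question (assumed to match the CSV contents)
aovdata <- data.frame(
  Food        = factor(c(50, 100, 50, 100, 50, 100)),
  Temperature = factor(c(11, 11, 14, 14, 17, 17)),
  Area        = c(820.2175, 936.5437, 1506.568, 1288.053, 1692.882, 1792.54)
)

# Two-way ANOVA with interaction, using the data argument instead of attach()
fit_full <- aov(Area ~ Temperature * Food, data = aovdata)

nrow(aovdata)           # number of observations
length(coef(fit_full))  # number of estimated parameters
df.residual(fit_full)   # residual degrees of freedom
```

The model estimates 1 intercept + 2 temperature + 1 food + 2 interaction parameters from only 6 rows, so the residual degrees of freedom come out to zero and there is nothing to form an F ratio against.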

It's fine for temperature and food to be categorical (factor) predictors; that's how they would be treated in a classic two-way ANOVA design. It's just that in order to analyze this design with the interaction you need more replication.
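
If more replicates are not available, one possible workaround is to drop the interaction term and fit the additive model. This is only a sketch, and it rests on the assumption that temperature and food do not interact, which these data cannot test. The data are again typed in from the question, and `fit_add` is an illustrative name:

```r
# Data typed in from the question (assumed to match the CSV contents)
aovdata <- data.frame(
  Food        = factor(c(50, 100, 50, 100, 50, 100)),
  Temperature = factor(c(11, 11, 14, 14, 17, 17)),
  Area        = c(820.2175, 936.5437, 1506.568, 1288.053, 1692.882, 1792.54)
)

# Additive model: main effects only, no Temperature:Food interaction
fit_add <- aov(Area ~ Temperature + Food, data = aovdata)

df.residual(fit_add)  # residual degrees of freedom are no longer zero
summary(fit_add)      # table now has "F value" and "Pr(>F)" columns
```

With only 2 residual degrees of freedom the tests have very little power, so the P values should be read with caution.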

Ben Bolker