I am trying to perform a difference of means test in R, but I get the following error:

Error in t.test.formula(age ~ fare, data = FARE, var.equal = TRUE) : grouping factor must have exactly 2 levels

This is the exercise question:

Perform an appropriate bivariate statistical test to explore the relationship between age and fare. Provide command(s) to perform the analysis. (Hint: I am not asking you to run a regression.)

This is my code:

`td` is my data set.

FARE <- td[!is.na(td$fare) & !is.na(td$age), ]  # keep rows where both fare and age are observed
by(data = FARE$fare,     # Y: the values to summarise
   INDICES = FARE$age,   # X: the grouping variable
   FUN = summary)        # the function applied within each group
t.test(age ~ fare,       # formula: response ~ grouping variable
       data = FARE,      # the data frame holding both variables
       var.equal = TRUE) # assume equal variances in the two groups

Thanks

Peter
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. – MrFlick Dec 11 '21 at 19:41
  • Check how many levels `FARE$fare` has. – George Savva Dec 11 '21 at 20:14
  • Run `table(FARE$fare)` to see how many distinct values that variable takes and what they are. Fares typically take many values, so you may have chosen the wrong test and should instead use `lm`, i.e. regress one continuous variable on the other. Alternatively, convert fare to a factor if it has only three to five levels. If you regress a continuous variable on a factor, you get an F-test for equality of means among the various levels. – IRTFM Dec 11 '21 at 20:47
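
The check suggested in the comments can be sketched as below. The original data were not posted, so the small `td` data frame here is a made-up stand-in (an assumption, not the asker's data), and `n_levels` and `result` are illustrative names. The sketch counts the distinct values of `fare`, which is what `t.test()` uses as the grouping variable in `age ~ fare`, and then runs `cor.test()`. That is one possible non-regression bivariate test for two continuous variables, not necessarily the one the exercise intends.

```r
# Hypothetical stand-in for the asker's `td`; the real data were not posted.
td <- data.frame(
  age  = c(22, 38, 26, 35, NA, 54, 2, 27, 14, 58),
  fare = c(7.25, 71.28, 7.93, 53.10, 8.46, 51.86, 21.08, 11.13, 30.07, NA)
)

# Keep only rows where both variables are observed.
FARE <- td[complete.cases(td[, c("age", "fare")]), ]

# In t.test(age ~ fare), `fare` is the grouping variable, so it must take
# exactly two distinct values. Count how many it actually takes.
n_levels <- length(unique(FARE$fare))
print(n_levels)

# With two continuous variables, a correlation test is one bivariate option.
result <- cor.test(FARE$age, FARE$fare, method = "pearson")
print(result)
```

If `n_levels` is larger than 2, the formula `age ~ fare` cannot define two groups, which matches the error message in the question.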
