R. Conversion of factor from dataframe into numeric format only works for complete column, not subset

Question

I would like to run a t test on a variable (a) that is split into two groups (1 and 2) by the grouping variable in column c, and a Shapiro test to check for normal distribution. First I defined the two samples for the t test:

a_group1 <- data[data$c == 1,"a", header = FALSE]
a_group2 <- data[data$c == 2,"a", header = FALSE]

Next, I performed the t test, which worked fine:

t.test(x = a_group1, y = a_group2, paired = FALSE, var.equal = TRUE)

Then I wanted to test for normal distribution using the Shapiro test, but R reported that the factors were not numeric. Therefore, I converted columns a and c to numeric with

data$a <- as.numeric(as.character(data$y2))
data$c <- as.numeric(as.character(data$c1))

Calling class() returns "numeric" for both subsets.

The Shapiro test works fine for the complete column a, and I can also use the hist() function without trouble.

However, if I want to test the subsets (groups 1 and 2 with the grouping variables from column c)

shapiro.test(a_group1)

I receive an error that

is.numeric(x) is not TRUE

shapiro.test(as.numeric(a_group1))

Returns

'list' object cannot be coerced to type 'double'

The complete column a seems to be "numeric", but the grouped subsets are not, and I am lost with the last error.

I'll be happy to provide any additional information!

What package do you have loaded that makes the following code run? `data[data$c == 1,"a", header = FALSE]` I get "Error in `[.data.frame`(data, data$c == 1, "a", header = FALSE) : unused argument (header = FALSE)" — Ian Campbell, May 08 '21 at 14:13
See here . Or `dput()` your data for a minimal reproducible example. — TarJae, May 08 '21 at 14:18
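
One possible reading of the two errors, sketched on made-up data since the original data was not posted: both messages are what R gives when a_group1 is still a one-column data frame (a list of columns) rather than a plain vector, for example because data is a tibble, or because the subsets were created before the columns were converted. The data frame below and the use of drop = FALSE to imitate tibble-style subsetting are assumptions for illustration only, not the asker's actual setup.

```r
# Hypothetical example data; the asker's real data is unknown.
set.seed(1)
data <- data.frame(
  a = factor(round(rnorm(20, mean = 10), 2)),  # numbers stored as a factor
  c = factor(rep(c(1, 2), each = 10))          # grouping variable as a factor
)

# Convert the factor columns to numeric first ...
data$a <- as.numeric(as.character(data$a))
data$c <- as.numeric(as.character(data$c))

# ... and only then build the subsets, so they pick up the new type.
# drop = FALSE keeps a one-column data frame, which is how a tibble
# behaves by default when subset with [rows, "a"].
a_df <- data[data$c == 1, "a", drop = FALSE]
is.numeric(a_df)       # FALSE: a data frame is a list of columns

# Extracting the column itself gives a plain numeric vector.
a_group1 <- data$a[data$c == 1]
a_group2 <- data$a[data$c == 2]
is.numeric(a_group1)   # TRUE

# shapiro.test() accepts the numeric vectors.
shapiro.test(a_group1)
shapiro.test(a_group2)
```

If this is indeed the cause, rebuilding a_group1 and a_group2 with data$a[...] (or data[["a"]][...]) after the conversion should be enough; the same vectors can then be passed to t.test().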
