I'm trying to pass all the numbers in a column of a data frame to wilcox.test() in R for analysis.

If I pass c(1,2,3) it works just fine, but I want to pass a column from an existing data frame to the function without typing the values out by hand (there are about 2 million rows).

Passing the column gives the error: 'x' must be numeric. (Understandably so.)

Sample data:

            AA  AC          AD          AE         AF
     0.6047619  NA  -1.0000000   1.0059524  -1.000000
    -0.2348790  NA   0.5812500   0.1294643  -1.000000
     0.9523810  -1  -1.0000000  -1.0000000  -1.000000

Statement used:

{print(wilcox.test(list, y = NULL, correct = TRUE, mu = 0, exact = NULL))}

Error message:

    Error in wilcox.test.default(list, y = NULL, correct = TRUE, mu = 0, exact = NULL) :
      'x' must be numeric

Here list is one column of the data frame: everything from the header AA down to 0.9523810.

Jilber Urbina
user2211355

2 Answers

1

If list is your data frame, you can obtain the results for each column with the following:

apply(list,2,wilcox.test, y = NULL, correct = TRUE, mu = 0, exact = NULL)

You're getting an error because one of your columns is not a numeric variable.
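
A minimal sketch of one way to find the non-numeric columns and leave them out before testing. The data frame name df, the character column id, and all values below are made up for illustration; they are not the asker's data. It uses lapply() rather than apply(), because apply() first converts the data frame to a matrix, and a single character column turns the whole matrix into character.

```r
# Hypothetical example data: three numeric columns and one character column.
df <- data.frame(
  id = c("a", "b", "c", "d"),
  AA = c(0.60, -0.23, 0.95, 0.41),
  AD = c(-1.00, 0.58, -1.00, 0.12),
  AF = c(-1.00, -1.00, -1.00, 0.30),
  stringsAsFactors = FALSE
)

# Flag which columns are numeric.
is_num <- sapply(df, is.numeric)
print(is_num)

# Keep only the numeric columns and test each one separately.
# exact = FALSE asks for the normal approximation, which avoids
# warnings about ties in this small made-up sample.
results <- lapply(df[is_num], function(col) {
  wilcox.test(col, mu = 0, correct = TRUE, exact = FALSE)
})

# One p-value per numeric column.
p_values <- sapply(results, function(r) r$p.value)
print(p_values)
```

Running str(df) on the real data is a quick way to see which columns were read in as character or factor.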

Thomas
  • I tried removing the first column using names(list). It didn't change anything, I'm still getting the same error message. – user2211355 Jun 04 '13 at 14:30
  • Are all of the columns in your dataframe numeric? It seems not. Find the ones that are not numeric and remove them and then this will work. – Thomas Jun 04 '13 at 14:35
1

To specify a given column in a data frame named df, you can use one of the following:

df[, 1]     # by number: first column
df[, "x1"]  # by name: column that is named "x1"
df$x1       # also by name

So in this case, if you wanted the column named "AA", you would use:

wilcox.test(df$AA, y=NULL, correct=TRUE, mu=0, exact=NULL)
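
One possible cause of the error that is quick to rule out: indexing with single brackets and a name only, as in df["AA"], returns a one-column data frame rather than a numeric vector, and wilcox.test() does not accept a data frame as x. The sketch below uses a small made-up data frame (the name df and its values are assumptions for illustration) to compare the forms.

```r
# Hypothetical example data.
df <- data.frame(AA = c(0.60, -0.23, 0.95, 0.41),
                 AC = c(NA, NA, -1, 0.5))

print(class(df["AA"]))    # a data frame, not a vector
print(class(df$AA))       # a numeric vector
print(class(df[["AA"]]))  # also a numeric vector

# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  wilcox.test(df["AA"], mu = 0),
  error = function(e) conditionMessage(e)
)
print(msg)

# Extracting the column as a vector works.
res <- wilcox.test(df$AA, mu = 0)
print(res$p.value)
```

If class(df$AA) reports "character" or "factor" instead, the column itself needs converting before it can be tested.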
Hong Ooi