
I have two columns of numerical data in a tab-delimited file, as follows:

Si1         Si2
8,99691     7,495936
7,7164173   8,092645
4,4428697   4,298263
7,4302206   7,189521
5,897344    5,316047
...

To calculate the correlation between the two columns, I wrote the following R code:

int<-read.table("data.txt",sep="\t",head=TRUE)
attach(int)
cor(int$Si1,int$Si2)

But it gives the following error:

Error in cor(int$Si1,int$Si2) : 'x' must be numeric

Can anybody tell me how to solve this?

Mikko
nit
    Here's a question that deals with using commas as decimal separators - http://stackoverflow.com/questions/6123378/how-to-read-in-numbers-with-a-comma-as-decimal-separator – Jesse Anderson May 02 '12 at 19:42

2 Answers


At a minimum, you need to write read.table("data.txt", sep="\t", header=TRUE, dec=","). Your data uses a comma as the decimal separator, whereas R assumes a period (.) by default.
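As a rough sketch of the difference, assuming the file looks like the first rows shown in the question (the inline txt string below is only a stand-in for data.txt, and the names bad and good are made up for illustration):

```r
# Stand-in for data.txt: the first rows from the question, tab-delimited
txt <- paste(
  "Si1\tSi2",
  "8,99691\t7,495936",
  "7,7164173\t8,092645",
  "4,4428697\t4,298263",
  "7,4302206\t7,189521",
  "5,897344\t5,316047",
  sep = "\n"
)

# Default dec = ".": values such as "8,99691" are not parsed as numbers
bad <- read.table(text = txt, sep = "\t", header = TRUE)
is.numeric(bad$Si1)   # FALSE, which is why cor() complains

# dec = "," tells R that the comma is the decimal separator
good <- read.table(text = txt, sep = "\t", header = TRUE, dec = ",")
is.numeric(good$Si1)  # TRUE
cor(good$Si1, good$Si2)
```

With a real file, replace text = txt with the file name. Running str() on the data frame after reading is a quick way to check that both columns came in as num.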

Mikko

To calculate a correlation between two vectors, both must be numeric. Your data contains commas, so R does not read the values as numbers.

Are they meant to be there? This works fine:

x<-c(1,2,3,4,5)
y<-c(1,2,3,4,5)
cor(x,y)

returns [1] 1
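If the commas are meant to be decimal separators and the data has already been read in as text, one possible workaround is to swap the comma for a period and convert. This is only a sketch: to_num is a hypothetical helper name, and the vectors hold the first rows from the question.

```r
# Values as they arrive when the comma-decimal columns are read as text
si1_chr <- c("8,99691", "7,7164173", "4,4428697", "7,4302206", "5,897344")
si2_chr <- c("7,495936", "8,092645", "4,298263", "7,189521", "5,316047")

# Hypothetical helper: replace the decimal comma with a period, then convert.
# as.character() also covers the case where the column was read as a factor.
to_num <- function(x) as.numeric(sub(",", ".", as.character(x), fixed = TRUE))

si1 <- to_num(si1_chr)
si2 <- to_num(si2_chr)

is.numeric(si1)  # TRUE
cor(si1, si2)
```

This assumes each value contains at most one comma and no thousands separators.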
user1317221_G