I am working with readHTMLTable and am having trouble doing calculations on the columns: when I convert a column with as.numeric, its values are replaced by their ranks. Can anyone help?

> library(XML)
> a=readHTMLTable("http://www.nhl.com/ice/standings.htm?season=20132014&type=LEA",which=3,trim=FALSE)
> a[,5]
 [1] 54 54 52 52 51 51 46 46 46 46 43 45 42 43 39 40 38 37 38 35 37 37 38 36 36 34 35 29 29 21
Levels: 21 29 34 35 36 37 38 39 40 42 43 45 46 51 52 54
> a[,5]=as.numeric(a[,5])
> a[,5]
 [1] 16 16 15 15 14 14 13 13 13 13 11 12 10 11  8  9  7  6  7  4  6  6  7  5  5  3  4  2  2  1

I would like to compute on the values of a[,5], not the ranks. For example, mean(a[,5]) should be (54+54+52+...+21)/30, not

mean(a[,5])
[1] 8.933333

frank

1 Answer

The problem is that the column is a factor: calling as.numeric on a factor returns its internal integer level codes, not the values shown. See this post.

The canonical way to handle this is as.numeric(levels(a[,5]))[a[,5]].

However, the method I often use is as.numeric(as.character(a[,5])) because it's easier to remember.
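
For illustration, here is a minimal sketch comparing the two conversions without hitting the website. It assumes the column is a factor holding the 30 values printed in the question; the names x, codes, via_levels and via_char are made up for this example.

```r
# Rebuild the column as a factor from the values printed in the question
# (stand-in for a[,5]; no network access needed).
x <- factor(c(54, 54, 52, 52, 51, 51, 46, 46, 46, 46, 43, 45, 42, 43, 39,
              40, 38, 37, 38, 35, 37, 37, 38, 36, 36, 34, 35, 29, 29, 21))

# as.numeric() on a factor gives the integer level codes (the "ranks").
codes <- as.numeric(x)

# Convert the level labels to numbers, then index them by the factor.
via_levels <- as.numeric(levels(x))[x]

# Convert each element to its label first, then to a number.
via_char <- as.numeric(as.character(x))

identical(via_levels, via_char)  # TRUE: both recover the original values
mean(codes)                      # 8.933333, the unwanted result from the question
mean(via_char)                   # 41, i.e. (54+54+52+...+21)/30
```

Both approaches give the same numbers. The levels-based form converts each distinct label only once, which may matter for long columns with few distinct values.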

rsoren