I am quite new to R and I am having a problem checking some values for equality. I have a data frame `rt` (below), and I wish to check whether the values in column `rt$V8` are equal to 606.

        V1    V2                V3 V4       V5  V6 V7   V8   V9
710 256225  RAIN  1853-12-26 00:00  1  DLY3208 900  1  606 1001
712 256225  RAIN  1853-12-27 00:00  1  DLY3208 900  1  606 1001
714 256225  RAIN  1853-12-28 00:00  1  DLY3208 900  1  606 1001
716 256225  RAIN  1853-12-29 00:00  1  DLY3208 900  1  606 1001
718 256225  RAIN  1853-12-30 00:00  1  DLY3208 900  1  606 1001
720 256225  RAIN  1853-12-31 00:00  1  DLY3208 900  1  606 1001

    > typeof(rt$V8)
    [1] "integer"

    > mode(rt$V8)
    [1] "numeric"

    > class(rt$V8)
    [1] "factor"

    > rt$V8
    [1]  606  606  606  606  606  606
    Levels:  606 1530

Test if equal to 606:

    > rt$V8 == 606
    [1] FALSE FALSE FALSE FALSE FALSE FALSE

    > as.integer(rt$V8) == as.integer(606)
    [1] FALSE FALSE FALSE FALSE FALSE FALSE

I do not understand why these checks return `FALSE`. I would appreciate any advice, please.

Steph Locke
Flora
  • That's not the correct way to convert a `factor` to an `integer`. See [this](http://stackoverflow.com/questions/23206700/sum-on-a-factor-column-returns-incorrect-result/23206762#23206762) for a detailed answer. – ilir Apr 30 '14 at 10:35
  • Duplicate. Not your fault, but I would have tried searching for `factor` or looking at the documentation. You should use `as.numeric(as.character(rt$V8)) == 606`. – Hugh Apr 30 '14 at 10:59
  • Thanks for your answers. Apologies for posting a duplicate. – Flora Apr 30 '14 at 12:46
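
The conversion suggested in the comments can be sketched as follows. This is a minimal illustration, not the asker's actual data: the vector `v8` is hypothetical, and the leading space in the `" 606"` level is an assumption based on the `Levels:  606 1530` printout in the question.

```r
# Hypothetical factor standing in for rt$V8. The leading space in " 606"
# is an assumption inferred from the printed levels in the question.
v8 <- factor(c(" 606", " 606", "1530"), levels = c(" 606", "1530"))

# as.integer() on a factor returns the internal level codes,
# not the numbers shown when the factor is printed.
codes <- as.integer(v8)

# Comparing a factor with a number compares the level labels as strings,
# so the label " 606" does not match "606".
naive <- v8 == 606

# Convert the labels to character first, then to numeric.
values <- as.numeric(as.character(v8))
matches <- values == 606
```

Here `codes` holds the positions of each value in the level set rather than 606 or 1530, which is why the `as.integer()` comparison in the question fails, while `matches` is computed from the real numeric values.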

1 Answer

I have encountered the same issue multiple times, and the real problem is usually how the data is imported into R. If you are using `read.csv` or a similar function, there is an argument called `colClasses` which is immensely useful. With it you can tell R the type of each column, and R will then not convert your numeric columns into factors.

An easy example is shown here:

Specifying colClasses in the read.csv
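
As a rough sketch of the `colClasses` idea, assuming a comma-separated file without a header row: the inline `csv_text` below is a made-up stand-in, since the post does not show the real file or its separator, and the four column types are likewise assumptions.

```r
# Hypothetical two-row CSV standing in for the real file,
# which is not shown in the question.
csv_text <- "256225,RAIN,1853-12-26 00:00,606\n256225,RAIN,1853-12-27 00:00,1530"

# header = FALSE gives the default column names V1..V4.
# colClasses states the type of each column up front, so the
# last column is read as integer rather than as a factor.
rt_demo <- read.csv(text = csv_text, header = FALSE,
                    colClasses = c("integer", "character",
                                   "character", "integer"))

col_class <- class(rt_demo$V4)
is_606 <- rt_demo$V4 == 606
```

With the column read as integer, the plain `== 606` comparison behaves as expected. Note that since R 4.0.0 the `stringsAsFactors` default is `FALSE`, so newer versions no longer turn string columns into factors unless asked to.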

Community
user3585718
  • Even if the data is generated in R rather than being imported, functions such as `data.frame` will still default string-like things to factors, so the `as.character` approach from the comments is probably the safest option. – Gavin Kelly Apr 30 '14 at 11:37
  • I am importing the data, so I will try this - thanks. – Flora Apr 30 '14 at 12:47