
Hi, I am really new to R and cannot for the life of me figure out what I am doing wrong. I try to access a value from my data frame and add it to a different value (all integers, not strings, to my knowledge). The result of the addition is a NaN. If I first convert the data frame value to an int, I get a number, but unfortunately the wrong one. When I look more closely at the cell value, I get:

> df_folk[20, "X.2"]
[1] 226993
51 Levels: 15-24 ar 196838 197925 197934 199675 200193 200729 200895 200956 201287 202442 202459 203778 204175 204176 204590 204780 207816 208052 208467 209027 ... 259043
> as.integer(df_folk[20, "X.2"])
[1] 38

The first value is what I would expect and is what I see in the data frame. I have no idea what the second value is. Also, when I try to check the data type of the cell value, I get:

> typeof(df_folk[20, "X.2"])
[1] "integer"

I'm very confused. All I want to do is add a few specific cell values. What is the best way to do this?

Ivar Eriksson
  • The value is actually a `factor`; do `as.integer(as.character(df_folk$X.2[20]))`. – Ronak Shah Dec 09 '18 at 14:05
  • Thanks, that worked. Do you care to explain why, what's a factor? – Ivar Eriksson Dec 09 '18 at 14:06
  • Your variable, for whatever reason, is stored as a factor. Have a read of `?factor` if you want to learn more. – lmo Dec 09 '18 at 14:07
  • In R, a `factor` is used to encode strings in an *efficient* manner. In fields where strings are repeated frequently (e.g., gender, state/country), it can make sense, for storage as well as processing speed, to store each unique string once and reference it with an integer index. The unique strings are known as the `levels` of a `factor`. In this case there are 51 unique strings, and `df_folk[20,"X.2"]` is the 38th of those 51. – r2evans Dec 09 '18 at 14:08
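
To make the comments above concrete, here is a minimal sketch of the behaviour they describe. The vector `x` is a made-up stand-in for `df_folk$X.2` (the real data is not shown in the question), and it assumes the column became a factor because it contains at least one non-numeric entry such as `15-24 ar`.

```r
# Hypothetical stand-in for df_folk$X.2: one non-numeric entry ("15-24 ar")
# is enough for the column to be read as strings and stored as a factor.
x <- factor(c("15-24 ar", "226993", "196838", "259043"))

typeof(x)   # "integer": the storage type of the underlying level codes
class(x)    # "factor": what the object actually is

# as.integer() on a factor returns the level code (an index into levels(x)),
# not the number printed in the cell.
code <- as.integer(x[2])

# Going through character first recovers the printed number.
value <- as.integer(as.character(x[2]))

# Arithmetic directly on a factor is not defined: R warns and returns NA.
bad <- suppressWarnings(x[2] + 1L)

# After the conversion, addition behaves as expected.
total <- value + as.integer(as.character(x[3]))
```

This is why `typeof()` reports `"integer"` even though the cell is not a plain integer: `typeof()` looks at the storage of the level codes, while `class()` shows that the object is a factor. `levels(x)[code]` gives back the printed string, which is the relationship r2evans describes between the 38 and the 51 levels.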
