I indexed a data.frame with a factor instead of a character index and received the wrong row. I was expecting to get a warning. How can this be explained?

df<-data.frame(A=1:4, B=2:5, C=3:6, row.names = c("6", "8", "9", "19"))
ci<-row.names(df)
fi<-as.factor(ci)

df
   A B C
6  1 2 3
8  2 3 4
9  3 4 5
19 4 5 6

ci[1]
[1] "6"

fi[1]
[1] 6
Levels: 19 6 8 9

df[ci[1],]
  A B C
6 1 2 3

df[fi[1],]
  A B C
8 2 3 4
dorit
  • 6 is the second level of your factor. –  Oct 15 '15 at 09:07
  • Compare `df[as.character(fi[1]),]` vs `df[as.integer(fi[1]),]`, but I do agree that `factor`s in R are a pain and take a lot of time to get used to. They have probably done more harm than good over the years, especially when it comes to [converting them to numeric values](http://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-an-integer-numeric-without-a-loss-of-information). – David Arenburg Oct 15 '15 at 09:10
  • Try `as.numeric(fi[x])` for different `x`. As pointed out by @Pascal, the levels are ordered as `19 6 8 9`. With `df[fi[1],]` you are selecting the second row, because `fi[1]` is the level 6, which is the second level in that sequence. – RHertel Oct 15 '15 at 09:16
  • To be more explicit: subsetting is documented to work with characters, integers/numerics or logical values, but *not* with factors. Thus, if you use factors for subsetting, the underlying integers are used. – Roland Oct 15 '15 at 09:20
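
The comments above can be checked with a short sketch that reuses the question's data. The names `by_factor`, `by_code` and `by_name` are only illustrative and do not appear in the original post. It assumes the levels sort as `19 6 8 9`, as shown in the question's output.

```r
# Rebuild the data from the question
df <- data.frame(A = 1:4, B = 2:5, C = 3:6,
                 row.names = c("6", "8", "9", "19"))
ci <- row.names(df)
fi <- as.factor(ci)

# Levels are sorted as strings, so "19" comes before "6"
levels(fi)

# Each element is stored as the position of its level;
# "6" is the second level, so fi[1] has the integer code 2
as.integer(fi)

# Indexing with the factor uses that integer code,
# so both of these select the second row (row name "8")
by_factor <- df[fi[1], ]
by_code   <- df[as.integer(fi[1]), ]

# Converting to character first looks the row up by name (row name "6")
by_name <- df[as.character(fi[1]), ]
```

Converting with `as.character()` before subsetting is probably the safest habit when row names might be stored as a factor, since no warning is raised in the factor case.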
