I have written a loop to select all of the factor variables from a large data frame. However, the loop does not work: when I index a column with a counter, the result is for some reason not recognized as a factor. Why is this, and how can I get the loop to work?

#Here's the loop:
y <- data.frame(c = 1:2661)
for (i in 1:ncol(x)){
  ifelse(is.factor(x[i]) == FALSE, y <- cbind(y, x[i]), y <- y)
}

And the problem is clearly in the use of "i" to reference the column. For example:

#sample data 
df <- as.data.frame(structure(c(2L, 2L, 2L, 2L, 2L), 
                          .Label = c("female", "male"), class = "factor"))

names(df)[names(df) == "structure(c(2L, 2L, 2L, 2L, 2L), .Label = c(\"female\", \"male\"), class = \"factor\")"] <- "var"

#reference the column name directly
is.factor(df$var)
[1] TRUE
#use a counter to access the variable:
i <- 1
is.factor(df[i])
[1] FALSE

Is this something to do with R or is there something up with my PC? If it is something to do with R, can anyone explain what is going on and how to get my loop to function?

  • If you want to check whether your i-th column is a factor, just write `is.factor(df[,i])`. Remember to write the comma so that you select the column and not the element. – R18 Oct 09 '17 at 12:11
  • With `df[i]` you get a data.frame; with `df[,i]` the column is "dropped" to a plain vector (here, a factor). Run `?"["` in R to learn more. – Andre Elrico Oct 09 '17 at 12:25
  • [This post](https://stackoverflow.com/questions/36777567/is-there-a-logical-way-to-think-about-list-indexing/36815401) might also be worth reading. – lmo Oct 09 '17 at 13:19

1 Answer

You just need to change how you access the object. Since you're using a data frame, you have two options for accessing its columns:

1.

i <- 1
is.factor(df[,i]) 

2.

i <- 1
is.factor(df[[i]])
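
Applied to the loop from the question, this might look like the sketch below. The data frame `x` here is a small hypothetical stand-in for the real data, and the names `keep`, `factors_loop` and `factors_vec` are made up for illustration. It collects a logical flag per column rather than growing `y` with `cbind()`, which is one possible approach and not the only one.

```r
# Hypothetical stand-in for the asker's data frame `x`
x <- data.frame(
  id  = 1:4,
  sex = factor(c("female", "male", "male", "female")),
  grp = factor(c("a", "b", "a", "b")),
  val = c(0.5, 1.2, 3.3, 4.1)
)

# Loop version: x[[i]] extracts the column itself,
# so is.factor() sees a factor rather than a data frame
keep <- logical(ncol(x))
for (i in seq_along(x)) {
  keep[i] <- is.factor(x[[i]])
}
factors_loop <- x[, keep, drop = FALSE]

# Equivalent selection without an explicit loop
factors_vec <- x[, sapply(x, is.factor), drop = FALSE]

names(factors_loop)
# [1] "sex" "grp"
```

Note that the loop in the question tests `is.factor(...) == FALSE`, which would keep the non-factor columns; if that is what is wanted, use `x[, !keep, drop = FALSE]` instead.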
  • Thanks! That's the solution. Could you also maybe explain what was going on? – pd441 Oct 09 '17 at 12:23
  • @stevezissou Read `help("[")`. – Roland Oct 09 '17 at 12:23
  • As @Roland said, help("[") has the answers. But to answer your question quickly: in option 1 of my answer you are treating df as a matrix, indexed as df[row index, column index]; in option 2 you are treating df as a data frame, indexed as df[[column]]. Other ways to select a column are df$namecolumn and df[["namecolumn"]]. – Andryas Waurzenczak Oct 09 '17 at 17:57
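
To make the difference described in the comments concrete, here is a minimal sketch that checks what each indexing form returns. It assumes a plain base R `data.frame` (other data frame classes may behave differently with `df[, i]`), and the one-column `df` is a simplified version of the sample data from the question.

```r
# Simplified version of the question's sample data: one factor column
df <- data.frame(var = factor(c("male", "male", "male"),
                              levels = c("female", "male")))
i <- 1

class(df[i])      # "data.frame": single brackets without a comma keep the container
class(df[, i])    # "factor": matrix-style indexing drops a single column to a vector
class(df[[i]])    # "factor": double brackets extract the column itself
class(df$var)     # "factor": same as df[["var"]]

is.factor(df[i])  # FALSE, because a data frame is not a factor
```

This is why `is.factor(df[i])` in the question returns `FALSE` even though the column it contains is a factor.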