I am trying to write a "for" loop that iterates through each column in a data.table and returns a frequency table. However, the code below keeps failing with the error shown after it:

library(data.table)
library(datasets)
data(cars)
cars <- as.data.table(cars)
for (i in names(cars)){
  print(table(cars[,i]))
}

Error in `[.data.table`(cars, , i) : 
j (the 2nd argument inside [...]) is a single symbol but column name 'i' is not found. Perhaps you intended DT[, ..i]. This difference to data.frame is deliberate and explained in FAQ 1.1.

When I use each column individually like below, I do not have any problem:

> table(cars[,dist])

  2   4  10  14  16  17  18  20  22  24  26  28  32  34  36  40  42  46  48  50  52  54  56  60  64  66 
  1   1   2   1   1   1   1   2   1   1   4   2   3   3   2   2   1   2   1   1   1   2   2   1   1   1 
 68  70  76  80  84  85  92  93 120 
  1   1   1   1   1   1   1   1   1 

My data is quite large (8921483 x 52), which is why I want to use a "for" loop to run everything at once and then look at the results.

I included the cars dataset (which is easier to run) to demonstrate my code.

If I convert the dataset to a data.frame, the "for" loop runs without any problem. But I want to know why it does not work with data.table, because I am learning it and I believe it works better with large datasets.

If someone has already seen a post that answers this, please let me know, because I have been looking for one for several hours.

Nguyen
  • the error message is pretty clear about what you should be doing? – MichaelChirico Feb 23 '19 at 06:39
  • I read the FAQ 1.1, it explained that for data.table, we use cars[,dist], instead of cars[,"dist"], without the quotes. It does not say anything else. I tried to unquote values in names(cars), still not working. What I can't understand is why "i" does not take values in names(cars) and pass those values to cars[,i] in the for loop. – Nguyen Feb 23 '19 at 06:54
  • Your best option is to not use a `for` loop at all but use the `apply` function. I think the issue is that `i` is becoming the character vector of the name so it would be cars[,"dist"]. – George Feb 23 '19 at 07:26
  • "Perhaps you intended DT[, ..i]" – MichaelChirico Feb 23 '19 at 07:38

1 Answer

Some solution found here

My personal preference is the apply function, though:

library(data.table)
library(datasets)
data(cars)
cars <- as.data.table(cars)
# Build one frequency table per column (MARGIN = 2 means "by column")
apply(cars,2,table)
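One caveat with apply() is that it first converts the whole table to a matrix, which can be costly on a table with millions of rows and forces all columns to a common type. A minimal sketch of an lapply() variant follows; the names cars_dt and freq_tables are made up for this illustration and are not from the original answer.

```r
library(data.table)
library(datasets)

# cars_dt is an illustrative name, used to avoid overwriting the built-in `cars`
cars_dt <- as.data.table(cars)

# lapply() walks the columns of the data.table as a list, so each column
# keeps its own type and no matrix copy of the whole table is made.
freq_tables <- lapply(cars_dt, table)

# The result is a named list with one frequency table per column.
print(freq_tables$dist)
```

The result is indexed by column name, so individual tables can be inspected afterwards with freq_tables$speed, freq_tables$dist, and so on.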

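To make your loop work, change how the column is selected. Inside cars[, i], data.table looks for a column literally named "i"; extracting the column with [[i]] uses the string stored in i instead:

library(data.table)
library(datasets)
data(cars)
cars <- as.data.table(cars)
for (i in names(cars)){
  # cars[[i]] returns the column named by the string in i as a plain vector
  print(table(cars[[i]]))
}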
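The `..i` form suggested by the error message can also be used inside the loop. The sketch below compares it with `[[ ]]` extraction; cars_dt, col, by_prefix and by_extract are illustrative names, not part of the original post.

```r
library(data.table)
library(datasets)

# cars_dt is an illustrative name, used to avoid overwriting the built-in `cars`
cars_dt <- as.data.table(cars)

for (col in names(cars_dt)) {
  # The .. prefix tells data.table to look up `col` in the calling scope
  # instead of treating it as a column name; the result is a one-column
  # data.table.
  by_prefix <- table(cars_dt[, ..col])

  # [[col]] extracts the same column as a plain vector.
  by_extract <- table(cars_dt[[col]])

  # Both routes should give the same counts.
  stopifnot(all(as.vector(by_prefix) == as.vector(by_extract)))
  print(by_extract)
}
```

Neither form modifies cars_dt, so the loop can be rerun safely.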
George
  • Thank you so much for answering! I have never thought of using apply since I am used to using "for" loop with data.frame. I guess overall, for data.table, apply works better and more intuitive. – Nguyen Feb 23 '19 at 21:33