I have a data frame (data.table) and I want to remove all columns in which every value is zero. I have read "Remove columns from dataframe where ALL values are NA", but it doesn't help me much. My dataset has more than 3000 columns. My reproducible example is a data.frame, but how do I do the same for a data.table?
Please show what you have tried after posting several somewhat similar questions recently – talat Jan 22 '16 at 13:52
And note that none of the columns in your example has only zeros in it. – talat Jan 22 '16 at 13:56
Column no. 3 has three zeros and the rest are NAs – Aquarius Jan 22 '16 at 13:57
right, and `NA != 0` – talat Jan 22 '16 at 13:58
Sorry, I didn't get you – Aquarius Jan 22 '16 at 13:59
NA is not the same as zero – lebatsnok Jan 22 '16 at 14:00
Nope, NA is unavailable data, whereas 0 is the result of dividing two columns – Aquarius Jan 22 '16 at 14:01
Do you want to remove the columns containing only zeros (you have no such columns) or only zeros and NAs? And do you need a solution for data.frame or data.table? – lebatsnok Jan 22 '16 at 14:02
I'm removing the data.table tag, since your example data is, in fact, not a data.table. This is the standard dupe for the data.table question, though: http://stackoverflow.com/questions/9202413/how-do-you-delete-a-column-in-data-table – Frank Jan 22 '16 at 14:39
1 Answer
You can try something like this if you want to drop every column whose values are all NA or zero. Adjust the condition if you want to match NAs only or zeros only:
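# Keep only columns that are not entirely NA or zero; drop = FALSE keeps a data.frame even if one column survives
df <- df[, !vapply(df, function(x) all(is.na(x) | x == 0), logical(1)), drop = FALSE]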
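
To make the "modify the condition" part concrete, here is a minimal sketch on a made-up data frame. The toy data and the helper names (`all_zero_or_na`, `all_zero`, `all_na`, `drop_cols`) are hypothetical and only for illustration. The last part assumes the data.table package is installed and uses the usual `:= NULL` idiom to delete columns by reference, since indexing a data.table with a logical vector in `j` as above would not select columns without `with = FALSE`.

```r
# Toy data, invented for illustration
df <- data.frame(
  a = c(1, 0, 3),      # mixed values
  b = c(0, 0, 0),      # all zeros
  c = c(NA, NA, NA),   # all NA
  d = c(0, NA, 0)      # only zeros and NAs
)

# Predicates: TRUE means "this column should be dropped"
all_zero_or_na <- function(x) all(is.na(x) | x == 0)
all_zero       <- function(x) !anyNA(x) && all(x == 0)
all_na         <- function(x) all(is.na(x))

# Drop every column of `data` for which `pred` returns TRUE
drop_cols <- function(data, pred) {
  keep <- !vapply(data, pred, logical(1))
  data[, keep, drop = FALSE]
}

names(drop_cols(df, all_zero_or_na))  # "a"
names(drop_cols(df, all_zero))        # "a" "c" "d"
names(drop_cols(df, all_na))          # "a" "b" "d"

# data.table variant (runs only if the package is available)
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df)
  to_drop <- names(dt)[vapply(dt, all_zero_or_na, logical(1))]
  if (length(to_drop) > 0) dt[, (to_drop) := NULL]  # removes columns in place
  print(names(dt))
}
```

With 3000+ columns the `:= NULL` form may be preferable because it does not copy the remaining columns, whereas the data.frame subsetting builds a new object.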

Gopala