removing columns with NA values only

Question

I am using this command to remove the columns where all the values are NA.

testing5 <- subset(testing4,
                   select = -c(kurtosis_picth_belt, skewness_roll_belt, 
                   skewness_roll_belt.1, min_yaw_belt, amplitude_yaw_belt, 
                   kurtosis_roll_arm, kurtosis_picth_arm, kurtosis_yaw_arm, 
                   skewness_roll_arm, skewness_pitch_arm, kurtosis_picth_dumbbell, 
                   skewness_roll_dumbbell, skewness_pitch_dumbbell, min_yaw_dumbbell, 
                   kurtosis_roll_forearm, kurtosis_picth_forearm, skewness_roll_forearm, 
                   skewness_pitch_forearm))

Is there a shorter (programmatic) method?

Thanks and Regards, Partha

Does this answer your question? [Remove columns from dataframe where ALL values are NA](https://stackoverflow.com/questions/2643939/remove-columns-from-dataframe-where-all-values-are-na) — HarmlessEcon, Jul 29 '21 at 17:23

score 5 · Accepted Answer · answered Sep 12 '19 at 17:14


The tidyverse approach would look like this (also using @Rich Scriven's data):

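library(dplyr)  # provides %>% and select_if()
# d is the example data frame defined in Rich Scriven's answer
d %>% select_if(~any(!is.na(.)))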
#    x
# 1 NA
# 2  3
# 3 NA


Zeta


Rich Scriven · Answer 2 · 2015-11-10T17:28:34.780

4

You can remove the columns that contain only NA values with, for example:

d <- data.frame(x = c(NA, 3, NA), y = rep(NA, 3))
#    x  y
# 1 NA NA
# 2  3 NA
# 3 NA NA

d[!sapply(d, function(x) all(is.na(x)))]
#    x
# 1 NA
# 2  3
# 3 NA

On your data, this would be:

testing4[!sapply(testing4, function(x) all(is.na(x)))]
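
If you want to see which columns will be dropped before dropping them, the same test can be stored in a logical vector first. This is only a sketch: `testing4_demo` is a hypothetical stand-in, since the real `testing4` is not shown in the question, and `vapply()` is used in place of `sapply()` so the result is guaranteed to be a logical vector.

```r
# Hypothetical stand-in for testing4 (the real data is not shown in the question)
testing4_demo <- data.frame(
  roll_belt = c(1.4, 1.5, 1.6),
  kurtosis_picth_belt = c(NA, NA, NA),
  user_name = c("a", "b", NA)
)

# TRUE for every column whose values are all NA
all_na <- vapply(testing4_demo, function(x) all(is.na(x)), logical(1))

# Columns that would be removed
names(testing4_demo)[all_na]
# [1] "kurtosis_picth_belt"

# Keep the rest
testing5_demo <- testing4_demo[!all_na]
names(testing5_demo)
# [1] "roll_belt" "user_name"
```

Columns with only some missing values, such as `user_name` here, are kept.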

edited Nov 10 '15 at 17:28

answered Sep 18 '14 at 21:20

Rich Scriven


(+1) for the safer solution :) – David Arenburg Sep 18 '14 at 21:34

score 3 · Answer 3 · answered Sep 18 '14 at 21:27


Yet another way (a bit more vectorized), using @Richard's data:

d[!is.nan(colMeans(d, na.rm = TRUE))]
#    x
# 1 NA
# 2  3
# 3 NA


David Arenburg



On that road, perhaps something like `d[colSums(is.na(d)) < nrow(d)]` could be clearer? – alexis_laz Sep 18 '14 at 23:26
@alexis_laz, brilliant suggestion as usual – David Arenburg Sep 19 '14 at 06:03
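
A short sketch of why the `colSums(is.na(d))` variant may be the more robust of the two: `colMeans()` requires numeric input, so it fails as soon as the data frame has a character column, while counting NAs works for any column type. The data frame `d2` and its column `z` are made up for illustration.

```r
# Toy data with a character column added (hypothetical example)
d2 <- data.frame(
  x = c(NA, 3, NA),
  y = rep(NA, 3),
  z = c("a", NA, "c"),
  stringsAsFactors = FALSE
)

# colMeans() errors on non-numeric data, so the colMeans approach breaks here
colmeans_attempt <- try(colMeans(d2, na.rm = TRUE), silent = TRUE)
inherits(colmeans_attempt, "try-error")
# [1] TRUE

# Counting NAs per column works regardless of column type
kept <- d2[colSums(is.na(d2)) < nrow(d2)]
names(kept)
# [1] "x" "z"
```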
