Let's start with some data to make the example reproducible:

x <- structure(list(DC1 = c(5, 5, NA, 5, 4, 6, 5, NA, 4, 6, 6, 6, 
5, NA, 5, 5, 7), DC2 = c(4, 7, 4, 5, NA, 4, 6, 4, 4, 5, 5, 5, 
5, NA, 6, 5, 5), DC3 = c(4, 7, 4, 4, NA, 4, 5, 4, 5, 4, 5, 5, 
6, 4, 6, 6, 5), DC4 = c(4, 7, 5, NA, NA, 4, 6, 5, 5, 4, 3, 4, 
6, 5, 5, 6, 3), DC5 = c(7, 8, 5, NA, NA, 10, 7, 6, 8, 6, 6, 7, 
11, 10, 5, 7, 6), DC6 = c(8, 8, NA, NA, NA, 11, 9, 8, 9, 9, 10, 
10, 12, 16, 6, 8, 9), DC7 = c(10, 10, 10, NA, NA, 8, 9, 8, 13, 
8, 11, 9, 14, 13, 8, 8, 11), DC8 = c(17, 10, 10, NA, NA, 10, 
10, 10, 15, 10, 14, 11, 23, 15, 14, 13, 14), DC9 = c(16, 9, 9, 
NA, NA, 12, 13, 11, 13, 15, 15, 13, 17, 15, 25, 17, 12)), .Names = c("DC1", 
"DC2", "DC3", "DC4", "DC5", "DC6", "DC7", "DC8", "DC9"), class = "data.frame", row.names = c(NA, 
-17L))

How can I filter the data frame, keeping only the rows that have no missing values (NA) in columns DC3 to DC9?

antecessor

3 Answers

Here's a dplyr option:

library(dplyr)
x %>% 
  filter_at(vars(DC3:DC9), all_vars(!is.na(.)))

or:

x %>% 
  filter_at(vars(DC3:DC9), all_vars(complete.cases(.)))

and here's a tidyr option:

x %>% tidyr::drop_na(DC3:DC9)
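
In dplyr 1.0.0 and later the scoped verbs such as filter_at() are superseded (they still work), and if_all(), available from dplyr 1.0.4, is the suggested replacement. Here is a minimal sketch of the same filter in that style. The data frame df and its values are made up for illustration and are not the question's data:

```r
library(dplyr)

# Toy data (hypothetical values, not the data frame from the question)
df <- data.frame(
  DC1 = c(1, NA, 3, 4),
  DC2 = c(NA, 2, 3, 4),
  DC3 = c(1, 2, NA, 4),
  DC4 = c(1, NA, NA, 4)
)

# Keep rows where every column from DC3 to DC4 is non-missing;
# NAs in DC1 and DC2 are ignored
kept <- df %>%
  filter(if_all(DC3:DC4, ~ !is.na(.x)))
```

For the question's data the column range would be DC3:DC9.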
sbha

We can subset the columns by position and apply complete.cases:

x[complete.cases(x[3:9]),]

or using column names

x[complete.cases(x[paste0("DC", 3:9)]),]
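
"Rows that contain data" can be read two ways: every selected column is filled, or at least one is. complete.cases covers the first reading. The sketch below shows both on a small made-up data frame (df and its values are assumptions for illustration), using rowSums for the "at least one" reading:

```r
# Toy data (hypothetical values, not the data frame from the question)
df <- data.frame(
  DC1 = c(1, NA, 3),
  DC2 = c(1, 2, NA),
  DC3 = c(1, NA, NA),
  DC4 = c(1, 4, NA)
)

cols <- paste0("DC", 3:4)

# Rows where all of the selected columns are non-missing
all_present <- df[complete.cases(df[cols]), ]

# Rows where at least one of the selected columns is non-missing
any_present <- df[rowSums(!is.na(df[cols])) > 0, ]
```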
akrun

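You could use str_extract() from the stringr package to extract the number from each column name, and then select the columns by that number. Note that indexing the data frame with a single logical vector selects columns, not rows, so the selection still has to be passed to complete.cases to filter the rows.

# get the number from each column name ("\\d+" also matches multi-digit names such as DC10)
col_num <- as.numeric(stringr::str_extract(names(x), "\\d+"))

# keep rows with no NA in columns DC3 to DC10 (only DC3 to DC9 exist in the sample data)
x[complete.cases(x[col_num >= 3 & col_num <= 10]), ]

Note: to install stringr, run install.packages("stringr").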
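
If you would rather not depend on stringr, a similar effect is possible in base R by building the wanted names and keeping only those that actually exist in the data frame. This is a sketch on made-up data; df and wanted are hypothetical names, not part of the question:

```r
# Toy data (hypothetical values); note that there is no DC10 column
df <- data.frame(
  DC2 = c(1, 2, 3),
  DC3 = c(1, NA, 3),
  DC9 = c(1, 2, NA)
)

# Names DC3 to DC10, restricted to columns that are present in df
wanted <- intersect(paste0("DC", 3:10), names(df))

# Keep rows with no NA in the columns that were found
kept <- df[complete.cases(df[wanted]), ]
```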

Ferand Dalatieh