I want to count, for each row, the number of NAs that appear in a specified set of columns. In the data below, I'd like a row-wise count of the NAs in the first, last, address, phone, and state columns (excluding m_initial and customer from the count).

    first   m_initial     last         address            phone      state  customer 
    Bob         L         Turner       123 Turner Lane    410-3141   Iowa   NA        
    Will        P         Williams     456 Williams Rd    491-2359   NA     Y        
    Amanda      C         Jones        789 Haggerty       NA         NA     Y        
    Lisa        NA        Evans        NA                 NA         NA     N        

Desired output:

    first   m_initial   last       address            phone      state  customer na_count 
    Bob     L           Turner     123 Turner Lane    410-3141   Iowa   NA       0 
    Will    P           Williams   456 Williams Rd    491-2359   NA     Y        1
    Amanda  C           Jones      789 Haggerty       NA         NA     Y        2
    Lisa    NA          Evans      NA                 NA         NA     N        3  
bodega18

3 Answers

df$na_count <- rowSums(is.na(df[c('first', 'last', 'address', 'phone', 'state')])) 

df
   first m_initial     last         address    phone state customer na_count
1    Bob         L   Turner 123 Turner Lane 410-3141  Iowa     <NA>        0
2   Will         P Williams 456 Williams Rd 491-2359  <NA>        Y        1
3 Amanda         C    Jones    789 Haggerty     <NA>  <NA>        Y        2
4   Lisa      <NA>    Evans            <NA>     <NA>  <NA>        N        3
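
To see why the one-liner works, it can be broken into steps. This is a minimal sketch that rebuilds the question's data as a plain data frame; the names `cols` and `na_flags` are introduced here for illustration only and are not part of the answer above.

```r
# Rebuild the example data from the question as a plain data frame
df <- data.frame(
  first     = c("Bob", "Will", "Amanda", "Lisa"),
  m_initial = c("L", "P", "C", NA),
  last      = c("Turner", "Williams", "Jones", "Evans"),
  address   = c("123 Turner Lane", "456 Williams Rd", "789 Haggerty", NA),
  phone     = c("410-3141", "491-2359", NA, NA),
  state     = c("Iowa", NA, NA, NA),
  customer  = c(NA, "Y", "Y", "N"),
  stringsAsFactors = FALSE
)

# Columns to include in the count (illustrative name)
cols <- c("first", "last", "address", "phone", "state")

# is.na() on a data frame returns a logical matrix of the same shape
na_flags <- is.na(df[cols])

# TRUE counts as 1, so summing across each row gives the NAs per row
df$na_count <- rowSums(na_flags)
```

Because m_initial and customer are never passed to `is.na()`, their NAs cannot affect the count.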
Onyambu

Base R:

Similar to Onyambu's solution, but instead of rowSums this uses apply, applying sum(is.na(x)) to each row after subsetting the columns with df[, c(1, 3:6)].

df$na_count <- apply(df[,c(1,3:6)], 1, function(x) sum(is.na(x)))
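
Selecting columns by position, as in `c(1, 3:6)`, gives a different result if the column order ever changes. One possible variation, sketched here on the question's data, selects by name instead; the names `cols`, `by_position`, and `by_name` are illustrative only.

```r
# Rebuild the example data from the question as a plain data frame
df <- data.frame(
  first     = c("Bob", "Will", "Amanda", "Lisa"),
  m_initial = c("L", "P", "C", NA),
  last      = c("Turner", "Williams", "Jones", "Evans"),
  address   = c("123 Turner Lane", "456 Williams Rd", "789 Haggerty", NA),
  phone     = c("410-3141", "491-2359", NA, NA),
  state     = c("Iowa", NA, NA, NA),
  customer  = c(NA, "Y", "Y", "N"),
  stringsAsFactors = FALSE
)

# Positional selection, as in the line above
by_position <- apply(df[, c(1, 3:6)], 1, function(x) sum(is.na(x)))

# Same computation, selecting the columns by name (illustrative variation)
cols <- c("first", "last", "address", "phone", "state")
by_name <- apply(df[, cols], 1, function(x) sum(is.na(x)))

# Both selections cover the same five columns, so the counts agree
all(by_position == by_name)
```

Note that apply converts the data frame to a matrix before iterating over rows, which is harmless here because only `is.na()` is evaluated on each row.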

dplyr:

library(dplyr)
df %>%  
  mutate(na_count = rowSums(is.na(select(., -c(m_initial, customer)))))

Output:

   first m_initial     last         address    phone state customer na_count
1    Bob         L   Turner 123 Turner Lane 410-3141  Iowa     <NA>        0
2   Will         P Williams 456 Williams Rd 491-2359  <NA>        Y        1
3 Amanda         C    Jones    789 Haggerty     <NA>  <NA>        Y        2
4   Lisa      <NA>    Evans            <NA>     <NA>  <NA>        N        3
TarJae
library(tidyverse)

df %>%
  rowwise() %>%
  mutate(na_count = sum(is.na(c_across(all_of(c("first", "last", "address", "phone", "state"))))))
#> # A tibble: 4 × 8
#> # Rowwise: 
#>   first  m_initial last     address         phone    state customer na_count
#>   <chr>  <chr>     <chr>    <chr>           <chr>    <chr> <chr>       <int>
#> 1 Bob    L         Turner   123 Turner Lane 410-3141 Iowa  <NA>            0
#> 2 Will   P         Williams 456 Williams Rd 491-2359 <NA>  Y               1
#> 3 Amanda C         Jones    789 Haggerty    <NA>     <NA>  Y               2
#> 4 Lisa   <NA>      Evans    <NA>            <NA>     <NA>  N               3

Created on 2022-01-04 by the reprex package (v2.0.1)
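
The result above is still a rowwise tibble (note the `# Rowwise:` line in the output), so later dplyr verbs would keep operating one row at a time. If that is not wanted, one option is to finish the pipeline with `ungroup()`. A small sketch, assuming dplyr is installed; `cols` and `res` are illustrative names.

```r
library(dplyr)

# Example data from the question (tibble() is re-exported by dplyr)
df <- tibble(
  first     = c("Bob", "Will", "Amanda", "Lisa"),
  m_initial = c("L", "P", "C", NA),
  last      = c("Turner", "Williams", "Jones", "Evans"),
  address   = c("123 Turner Lane", "456 Williams Rd", "789 Haggerty", NA),
  phone     = c("410-3141", "491-2359", NA, NA),
  state     = c("Iowa", NA, NA, NA),
  customer  = c(NA, "Y", "Y", "N")
)

# Columns to include in the count (illustrative name)
cols <- c("first", "last", "address", "phone", "state")

res <- df %>%
  rowwise() %>%
  mutate(na_count = sum(is.na(c_across(all_of(cols))))) %>%
  ungroup()  # drop the rowwise grouping so later verbs act on whole columns
```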

Data:

# Assign the example data to df so the snippets above can be run as-is
df <- structure(list(first = c("Bob", "Will", "Amanda", "Lisa"), m_initial = c("L", 
"P", "C", NA), last = c("Turner", "Williams", "Jones", "Evans"
), address = c("123 Turner Lane", "456 Williams Rd", "789 Haggerty", 
NA), phone = c("410-3141", "491-2359", NA, NA), state = c("Iowa", 
NA, NA, NA), customer = c(NA, "Y", "Y", "N")), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))
jpdugo17