Count missing values with rowwise and add number of missing values

Question

Hi there: I would like to add a column to a dataframe like the one below that holds the number of missing values in each row.

var1<-rnorm(100)
var2<-rnorm(100)
df<-data.frame(var1, var2)
# Set two missing values in var1
df[1,1]<-NA
df[2,1]<-NA

library(tidyverse)
df
df %>% 
# I know select is somewhat superfluous in this dataframe, but I need it in my real example, so I want to be sure I get it right
  select(var1, var2) %>% 
  is.na() %>% 
#The missing values are there. 
  head()
#How do I add the counts
df %>% 
  select(var1, var2) %>% 
  rowwise() %>% 
  mutate(na=rowSums(is.na(.)))

score 3 · Answer 1 · answered Oct 14 '20 at 15:33

You can use rowSums directly:

library(dplyr)
df %>% mutate(na = rowSums(is.na(select(., var1, var2))))
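
For comparison, the same count can be sketched in base R without the pipe. This is only an illustration on a small made-up data frame (the values below are assumptions, not the asker's data): is.na() returns a logical matrix, and rowSums() adds up the TRUE values in each row.

```r
# Toy data frame with hypothetical values, for illustration only
df <- data.frame(var1 = c(NA, NA, 3), var2 = c(1, NA, 2))

# is.na() gives a logical matrix; rowSums() counts the TRUEs per row
df$na <- rowSums(is.na(df[, c("var1", "var2")]))

df
```

Selecting the columns by name keeps the count restricted to var1 and var2, which matters once the data frame has other columns.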

answered Oct 14 '20 at 15:33

Ronak Shah

score 2 · Answer 2 · answered Oct 14 '20 at 15:32

Try rowwise() together with mutate() to create Var, which stores the number of NA values in each row. For row-wise operations you can use c_across() to pick the columns to evaluate. Here is the code:

library(dplyr)
#Code
newdf <- df %>% rowwise() %>% mutate(Var=sum(is.na(c_across(var1:var2))))

Output:

# A tibble: 100 x 3
# Rowwise: 
      var1    var2   Var
     <dbl>   <dbl> <int>
 1 NA       0.990      1
 2 NA       0.509      1
 3 -0.248  -1.89       0
 4 -0.149  -0.230      0
 5 -0.808   0.421      0
 6  0.216  -1.36       0
 7 -0.319   1.50       0
 8 -0.0418  0.487      0
 9 -3.36   -2.37       0
10 -0.151  -0.0478     0
# ... with 90 more rows
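
Note that the result above is still a rowwise tibble (see the `# Rowwise:` line in the output), so later verbs will keep operating one row at a time. A minimal sketch of dropping that state with ungroup(), using a small made-up data frame rather than the asker's data:

```r
library(dplyr)

# Toy data frame with hypothetical values, for illustration only
df <- data.frame(var1 = c(NA, 0.5, NA), var2 = c(1.2, NA, NA))

newdf <- df %>%
  rowwise() %>%
  # Count NAs across var1 and var2 within each row
  mutate(Var = sum(is.na(c_across(var1:var2)))) %>%
  # Drop the rowwise state so later verbs see the whole data frame again
  ungroup()

newdf
```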

score 1 · Accepted Answer · answered Oct 14 '20 at 15:34

You don't need rowwise(). Just comment out that line and your code works.

This works:

df %>% 
  select(var1, var2) %>% 
  mutate(na = rowSums(is.na(.)))

answered Oct 14 '20 at 15:34

BHudson

When I try that on my underlying data I get this: `Error: Problem with `mutate()` input `na`. x Input `na` can't be recycled to size 1. ℹ Input `na` is `rowSums(is.na(.))`. ℹ Input `na` must be size 1, not 4202. ℹ Did you mean: `na = list(rowSums(is.na(.)))` ? ℹ The error occured in row 1.` I'm working with labelled data. Could that be an issue? – spindoctor Oct 14 '20 at 15:38
Possibly. Could you provide an example of the exact type of data you are working with in the question? Just a couple lines should be sufficient. – BHudson Oct 14 '20 at 15:42
Correction: it works fine. My dataframe had been grouped rowwise in previous code. Ungrouping it worked fine. Wow. Rowwise() appears to be tricky. – spindoctor Oct 14 '20 at 15:47
Ah yes, that would do it. Good catch. Really common when working with grouping to forget to ungroup. Thanks for accepting. – BHudson Oct 14 '20 at 15:48
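
The fix described in these comments can be sketched as follows. The data frame and its values are made up for illustration, and `df_rowwise` is a hypothetical name standing in for a data frame left in a rowwise state by earlier code. Calling ungroup() first means `.` is no longer evaluated one row at a time, so rowSums() returns one value per row instead of triggering the recycling error.

```r
library(dplyr)

# Toy data frame with hypothetical values, for illustration only
df <- data.frame(var1 = c(NA, NA, 3), var2 = c(1, NA, 2))

# Simulate a data frame that earlier code left in a rowwise state
df_rowwise <- df %>% rowwise()

# Remove the rowwise state before counting NAs per row
result <- df_rowwise %>%
  ungroup() %>%
  mutate(na = rowSums(is.na(.)))

result
```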