I would like to know how to find the number of missing values in a column using `apply` and `is.na`. The result should look like the image below.

  • Please provide the data using `dput()` in your post. – Ed_Gravy Nov 17 '22 at 19:45
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064) – MrFlick Nov 17 '22 at 19:52
  • Does this answer your question? [How to count number of rows with NA on each column?](https://stackoverflow.com/questions/63924532/how-to-count-number-of-rows-with-na-on-each-column) – Martin Gal Nov 17 '22 at 19:56
  • Try `sapply(df, \(x) sum(is.na(x)))`, where `df` is the name of your data.frame. – Martin Gal Nov 17 '22 at 19:57
  • try `colSums(is.na(df))` – Onyambu Nov 17 '22 at 20:10
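
The two base R suggestions in the comments above can be tried on a small made-up data frame. This is only a sketch: `df` and its columns are hypothetical sample data, not the data from the question (which was posted as an image).

```r
# Hypothetical sample data; the original data was not available
df <- data.frame(id = 1:6, val = c(1, 2, NA, NA, NA, NA))

# is.na(df) returns a logical matrix; colSums() counts the TRUEs per column
na_by_colsums <- colSums(is.na(df))

# sapply() visits each column in turn and counts its NAs
na_by_sapply <- sapply(df, function(x) sum(is.na(x)))

print(na_by_colsums)
print(na_by_sapply)
```

Both calls should give the same counts per column; `colSums(is.na(df))` is usually the shorter option when every column is of interest.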

1 Answer

I can't access imgur, so I can't see the data provided. Here is some sample data we can use instead:

Method 1

library(tidyverse)

# Create dummy data
id = c(1,2,3,4,5,6)
val = c(1,2,NA, NA, NA, NA)

df = data.frame(id, val)

# Count NAs 
df %>% summarise_all(~ sum(is.na(.)))

Output:

   id val
1  0   4

Method 2

According to the comment below, with dplyr:

library(dplyr)

df %>% summarise(across(everything(), ~ sum(is.na(.x))))
Ed_Gravy
  • I guess with the newer dplyr, you can use `df %>% summarise(across(everything(), ~ sum(is.na(.x))))` or in `base R` with `colSums(is.na(df))` – akrun Nov 17 '22 at 20:02
  • Thank you sir, will edit my answer based on your comment. – Ed_Gravy Nov 17 '22 at 20:03
  • How would you do `df %>% summarise_all(~ sum(is.na(.)))` with the `apply` function, and how would you find the missing values in the rows of the data frame? – Saba Al shawa Nov 24 '22 at 23:06
  • How would you do `df %>% summarise_all(~ sum(is.na(.)))` to find the number of missing values in a row using `apply` and `is.na`? Furthermore, what does the `.` in `is.na(.)` do? Is there a different variable we can use there so that it finds the missing values for the rows? – Saba Al shawa Nov 24 '22 at 23:18
  • `sapply(df, function(x) sum(is.na(x)))` does this with the `apply` family of functions. The dot `.` stands for each column in turn. – Ed_Gravy Nov 25 '22 at 01:38
  • Hello, `sapply(df, function(x) sum(is.na(x)))` lets me find the number of missing values for the columns. How would I do something similar for the rows? – Saba Al shawa Nov 26 '22 at 00:58
  • `rowSums(is.na(df))` – Ed_Gravy Nov 26 '22 at 05:51
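
Since the question and the follow-up comments ask specifically about `apply`, here is a minimal sketch of how its `MARGIN` argument switches between columns and rows. The data frame is the same made-up sample as in the answer, rebuilt here so the snippet runs on its own. In the dplyr version, `~ sum(is.na(.))` is shorthand for an anonymous function like the `function(x) sum(is.na(x))` used below, with `.` playing the role of `x`.

```r
# Hypothetical sample data, matching the dummy data in the answer
df <- data.frame(id = c(1, 2, 3, 4, 5, 6), val = c(1, 2, NA, NA, NA, NA))

# MARGIN = 2 applies the function to each column
na_per_col <- apply(df, 2, function(x) sum(is.na(x)))

# MARGIN = 1 applies the same function to each row
na_per_row <- apply(df, 1, function(x) sum(is.na(x)))

print(na_per_col)
print(na_per_row)

# The counts should agree with the vectorised helpers from the comments
stopifnot(all(na_per_col == colSums(is.na(df))))
stopifnot(all(na_per_row == rowSums(is.na(df))))
```

Note that `apply` converts the data frame to a matrix first, so for large data frames `colSums(is.na(df))` and `rowSums(is.na(df))` are likely the simpler choice.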