I have a data frame with 30k observations, but I think one of my columns contains only NA values. How can I check whether that specific column has only NA values? With so many observations, I can't check them without code.

Beck

3 Answers

We can compare the NA count against the total row count for each column; sapply returns TRUE for a column when the two counts are equal. We then subset the names of the data frame, keeping only the columns whose values are all NA.

names(df)[sapply(df, function(x) sum(is.na(x)) == length(x))]

[1] "v3"

Data:

df <- data.frame(v1=c(1,2,3), v2=c(4,NA,6), v3=c(NA,NA,NA))
Tim Biegeleisen

In Python (pandas), if your column is df['column_name'], you can compare its null count against its length:

if df['column_name'].isnull().sum() == df['column_name'].shape[0]:
    print('All nulls')
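
A more compact variant, offered as a sketch rather than the only way to do it: pandas can reduce the missing-value mask directly with `isna().all()`, which also lets you test every column at once. The frame below is a hypothetical stand-in that mirrors the R example data from the first answer; `all_na` and `all_na_columns` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the R example: only v3 is entirely missing.
df = pd.DataFrame({
    "v1": [1.0, 2.0, 3.0],
    "v2": [4.0, np.nan, 6.0],
    "v3": [np.nan, np.nan, np.nan],
})

# isna() marks missing cells; all() reduces each column to a single boolean.
all_na = df.isna().all()

# Names of the columns in which every value is missing.
all_na_columns = all_na[all_na].index.tolist()
print(all_na_columns)

# Single-column check, equivalent to comparing the null count to the row count.
print(df["v3"].isna().all())
```

For a single column, `df['column_name'].isna().all()` gives the same answer as the count comparison above without computing the sum.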
kotbegemot

With the tidyverse, you can check every column at once; a column whose result is TRUE contains only NA values:

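library(dplyr)   # provides %>% and summarise_all()
library(tibble)  # provides tibble()

data <- tibble(
  a = rnorm(1000, 5, 1),
  b = NA,
  c = c(NA, rnorm(999, 10, 5))
)

# TRUE for a column only when every value in it is NA
data %>%
  summarise_all(~all(is.na(.)))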
# A tibble: 1 x 3
  a     b     c    
  <lgl> <lgl> <lgl>
1 FALSE TRUE  FALSE
jyjek