11

I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:

      a   b 
 12 210 468 

I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:

# change nulls to NAs
nullToNA <- function(df){

  # split df into numeric & non-numeric columns
  a<-df[,sapply(df, is.numeric), drop = FALSE]
  b<-df[,sapply(df, Negate(is.numeric)), drop = FALSE]

  # Change empty strings to NA
  b<-b[lapply(b,function(x) levels(x) <- c(levels(x), NA) ),] # add NA level
  b<-b[lapply(b,function(x) x[x=="",]<- NA),]                 # change Null to NA

  # Put the columns back together
  d<-cbind(a,b)
  d[, names(df)]
}

However, I'm getting this error:

> foo<-nullToNA(bar)  
Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix  
Called from: FUN(X[[i]], ...)

I have tried the answer found here: Replace all 0 values to NA but it changes all my columns to numeric values.

Hong Ooi
Travis Heeter
  • Why not use the `is.null()` function instead of `x==""`? Maybe there is nothing to be found. Have you checked whether `levels` returns anything? You can step through the body of your function with your data: ignore the function wrapper and run its contents line by line. – A.Yazdiha Nov 02 '16 at 11:56
  • Possible duplicate of [Replace all 0 values to NA](http://stackoverflow.com/questions/11036989/replace-all-0-values-to-na) – Mateusz1981 Nov 02 '16 at 12:04

4 Answers

11

You can directly index fields that match a logical criterion. So you can just write:

df[is_empty(df)] = NA

Where is_empty is your comparison, e.g. df == "":

df[df == ""] = NA
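As a small illustration of this kind of indexing, here is a sketch on made-up data. The data frame `bar` and its columns `grp` and `val` are assumptions for the example, not the asker's real data. It also shows that the now-unused `""` level stays attached to a factor until it is dropped explicitly.

```r
# Hypothetical example data: a factor column containing empty strings
# and a numeric column
bar <- data.frame(
  grp = factor(c("", "a", "b", "a", "")),
  val = c(1, 2, 3, 4, 5)
)

# Replace every empty string with NA; the numeric column has no matches
bar[bar == ""] <- NA

# The "" level is still listed in levels(bar$grp); droplevels() removes it
bar <- droplevels(bar)

summary(bar$grp)
```

Calling `droplevels()` is optional; it only tidies the factor so that `summary()` no longer reports an empty level with a count of zero.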

But note that is.null(df) won’t work, and would be weird anyway¹. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.


¹ You’ll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE, because the NULL values are wrapped inside the list.
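One possible way to handle column types separately is to loop over the factor and character columns only and leave everything else untouched. This is a sketch, not the only way to do it: the helper name `empty_to_na` and the sample data are hypothetical. Note that each column is a plain vector, so it takes a single subscript (`col[i]`), unlike the `x[x == "", ]` form in the question.

```r
# Hypothetical helper (not from the original post): touch only text-like columns
empty_to_na <- function(df) {
  is_text <- vapply(
    df,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )
  for (nm in names(df)[is_text]) {
    col <- df[[nm]]
    # A vector takes one subscript: col[i], not col[i, ]
    col[!is.na(col) & col == ""] <- NA
    # Drop the now-unused "" level from factors; leave character vectors as is
    df[[nm]] <- if (is.factor(col)) droplevels(col) else col
  }
  df
}

# Made-up data for illustration
bar <- data.frame(
  grp = factor(c("", "a", "b", "a", "")),
  val = c(1, 2, 3, 4, 5)
)
foo <- empty_to_na(bar)
summary(foo$grp)
```

Because columns are modified in place, the column order and the numeric columns are preserved, so no split-and-`cbind` step is needed.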

Konrad Rudolph
2

This worked for me

    df[df == 'NULL'] <- NA
AMS
1

How about just:

df[apply(df, 2, function(x) x=="")] = NA

Works fine for me, at least on simple examples.

juod
  • (1) `""` ≠ `NULL`! (2) `apply` isn’t needed. – Konrad Rudolph Nov 02 '16 at 11:55
  • Agree with (2), I overcomplicated it :) But can you even have NULL values in R vectors?.. Anyway, OP's example function is looking for empty strings, so I figured that's what he wanted to replace. – juod Nov 02 '16 at 12:00
  • Admittedly having `NULL` values in tables is rare. It only works if the underlying (column) vector is a `list`. – Konrad Rudolph Nov 02 '16 at 12:04
  • Not so weird, at least not anymore. The tidyverse function `pivot_wider` puts NULL in for missing values. – PeterK Jan 11 '21 at 17:51
0

This is the function I used to solve this issue.

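null_na <- function(vector) {
  # work on labels: assigning factor elements would copy their integer codes
  if (is.factor(vector)) vector <- as.character(vector)
  new_vector <- rep(NA, length(vector))
  for (i in seq_along(vector)) {
    # test is.na() first: comparing NA with "" yields NA, which if() rejects
    if (!is.na(vector[i]) && vector[i] != "") new_vector[i] <- vector[i]
  }
  new_vector
}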

Just plug in the column or vector you are having an issue with.
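The same idea can probably be written without an explicit loop. The following is a minimal sketch; the name `empty_to_na_vec` is hypothetical, and it assumes that returning a character vector for factor input is acceptable.

```r
# Hypothetical vectorised counterpart of the loop-based function
empty_to_na_vec <- function(x) {
  if (is.factor(x)) x <- as.character(x)  # work on labels, not integer codes
  x[x %in% ""] <- NA                      # %in% never returns NA, so existing NAs are safe
  x
}

cleaned <- empty_to_na_vec(c("a", "", NA, "b"))
```

Using `%in%` rather than `==` avoids having to treat values that are already `NA` as a special case.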