Age <- c(90,56,51,'NULL',67,'NULL',51)
Sex <- c('Male','Female','NULL','male','NULL','Female','Male')
Tenure <- c(2,'NULL',3,4,3,3,4)
df <- data.frame(Age, Sex, Tenure)

In the example above, the 'NULL' values are stored as character strings. I am trying to replace them with NA. I was able to do this for a single column with `df$Age[which(df$Age=='NULL')] <- NA`, but I don't want to write this out for every column.

How can I apply similar logic to all columns so that every 'NULL' value in df is converted to NA? I am guessing that apply, a custom function, or a for loop would do it.

rawr
Ashish25
  • Check out my `makemeNA` function, described in [this answer](https://stackoverflow.com/a/29445422/1270695) and available from [here](https://github.com/mrdwab/SOfun). You could then just do `makemeNA(df, "NULL")`. – A5C1D2H2I1M1N2O1R2T1 Dec 28 '17 at 17:31
  • Possible duplicate: https://stackoverflow.com/questions/3357743/replacing-character-values-with-na-in-a-data-frame – user20650 Dec 28 '17 at 17:38

3 Answers


base R solution

replace(df, df =="NULL", NA)
CPak

You can also replace the values in one step using logical indexing:

df[df=="NULL"] <- NA
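A minimal sketch of how the one-step replacement could be followed by a type conversion, since replacing 'NULL' with NA alone leaves Age and Tenure as text. It assumes the example data is built with `stringsAsFactors = FALSE` (an assumption, not part of the question) and uses base R's `type.convert` with `as.is = TRUE` so that strings are not turned into factors.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the columns as character
Age <- c(90, 56, 51, 'NULL', 67, 'NULL', 51)
Sex <- c('Male', 'Female', 'NULL', 'male', 'NULL', 'Female', 'Male')
Tenure <- c(2, 'NULL', 3, 4, 3, 3, 4)
df <- data.frame(Age, Sex, Tenure, stringsAsFactors = FALSE)

# Replace every 'NULL' string with NA using a logical matrix index
df[df == "NULL"] <- NA

# Re-infer the column types; as.is = TRUE leaves strings as character
df[] <- lapply(df, type.convert, as.is = TRUE)

# Count the missing values per column as a quick check
colSums(is.na(df))
```

After this, `str(df)` can be used to confirm that Age and Tenure are no longer character columns.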
MKR

We can use dplyr to replace the 'NULL' values in all the columns and then convert the column types with type.convert. Currently, all the columns are of class factor (the conversion assumes that Age and Tenure should be numeric/integer).

library(dplyr)
res <- df %>%
         mutate_all(funs(type.convert(as.character(replace(., .=='NULL', NA)))))
str(res)
#'data.frame':   7 obs. of  3 variables:
#$ Age   : int  90 56 51 NA 67 NA 51
#$ Sex   : Factor w/ 3 levels "Female","male",..: 3 1 NA 2 NA 1 3
#$ Tenure: int  2 NA 3 4 3 3 4
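With more recent dplyr releases, `funs()` and `mutate_all()` have been superseded. The following is a rough equivalent using `across()`, assuming dplyr 1.0.0 or later is installed. It uses `na_if()` in place of `replace()` and passes `as.is = TRUE`, so Sex comes back as character rather than factor; both choices are assumptions of this sketch, not part of the answer above.

```r
library(dplyr)

# Example data from the question, kept as character columns
df <- data.frame(
  Age    = c(90, 56, 51, 'NULL', 67, 'NULL', 51),
  Sex    = c('Male', 'Female', 'NULL', 'male', 'NULL', 'Female', 'Male'),
  Tenure = c(2, 'NULL', 3, 4, 3, 3, 4),
  stringsAsFactors = FALSE
)

res <- df %>%
  # Turn the 'NULL' strings into NA in every column, then re-infer the type
  mutate(across(
    everything(),
    ~ type.convert(na_if(as.character(.x), "NULL"), as.is = TRUE)
  ))

str(res)
```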
akrun
  • Thanks for responding. I am trying to apply this logic to my larger data frame, which also contains a date-time column with a few 'NULL' values in it. However, I am getting an error: `Error in mutate_impl(.data, dots) : Evaluation error: character string is not in a standard unambiguous format.` – Ashish25 Dec 28 '17 at 17:42
  • @AshishSahu Assuming that the structure of the data is similar to the example you showed, it should work. Please compare `str(df)` with `str(yourlargerdata)` to see if there are any differences in class. – akrun Dec 28 '17 at 17:48
  • I figured it out. `str(myDataFrame)` showed a few date-time columns that included NULL values, so `mutate_all` or `replace` threw the evaluation error. To get around this, I converted all the columns with as.character and did the replacement afterwards, which worked for me. – Ashish25 Dec 28 '17 at 17:57
  • @AshishSahu I guess you have a `POSIXlt` column, which may not be supported. It should be `POSIXct` instead. – akrun Dec 28 '17 at 17:58
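One plausible reading of the error in the comments above is that comparing a date-time column with the string 'NULL' makes R try to parse 'NULL' as a date-time, which fails. A minimal sketch of a workaround is to restrict the replacement to text columns and leave date-time columns alone. The data frame `df2` and its column names are hypothetical, invented for illustration only.

```r
# Hypothetical data: one character column and one date-time column
df2 <- data.frame(
  Status = c("open", "NULL", "closed"),
  Seen   = as.POSIXct(c("2017-12-28 17:31:00", NA, "2017-12-28 17:58:00"),
                      tz = "UTC"),
  stringsAsFactors = FALSE
)

# Flag the columns that hold text (character or factor)
is_text <- vapply(df2,
                  function(col) is.character(col) || is.factor(col),
                  logical(1))

# Replace 'NULL' only in the text columns; factors become character here
df2[is_text] <- lapply(df2[is_text], function(col) {
  col <- as.character(col)
  col[col == "NULL"] <- NA
  col
})

str(df2)
```

This keeps the date-time column in its original class instead of converting every column to character first, which is the approach the asker describes.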