I am quite new to R and I am working on a data frame with several NULL values. So far I have not been able to replace them, and I can't wrap my head around a solution, so it would be amazing if anybody could help me.

All the variables where the NULL value comes up are classified as factor.

If I use the function is.null(data) the answer is FALSE, which means that they have to be replaced to be able to make a decent graph.

Can I use set.seed to replace all the NULL values, or do I need to use a different function?

pogibas
L.Geerlofs
  • 5
    To make your question easier to address, please read [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – steveb Sep 09 '17 at 17:54
  • 2
    NULL is not allowed in data frames, which is why a reproducible example would be terrific here. – Rich Scriven Sep 09 '17 at 18:25

2 Answers

You can use replace() (a base R function) with the %>% pipe that dplyr makes available

Data

# Character columns holding the literal string "NULL"
df <- data.frame(A=c("A","NULL","B"), B=c("NULL","C","D"), stringsAsFactors=FALSE)

Solution

library(dplyr)

ans <- df %>% replace(.=="NULL", NA) # replace with NA

Output

     A    B
1    A <NA>
2 <NA>    C
3    B    D

Another example

ans <- df %>% replace(.=="NULL", "Z") # replace with "Z"

Output

  A B
1 A Z
2 Z C
3 B D
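
The question mentions that the affected columns are factors, while the data above uses character columns. A minimal sketch of the same idea for factor columns follows; it assumes the unwanted entries are the literal string "NULL", and the helper name null_to_na is made up for illustration.

```r
# Same example data, but stored as factors (as described in the question)
df <- data.frame(A = c("A", "NULL", "B"),
                 B = c("NULL", "C", "D"),
                 stringsAsFactors = TRUE)

# Hypothetical helper: set "NULL" entries to NA and drop the unused level
null_to_na <- function(x) {
  if (is.factor(x)) {
    x[x %in% "NULL"] <- NA   # %in% never returns NA, so existing NAs are safe
    x <- droplevels(x)       # remove the now-empty "NULL" level
  }
  x
}

# df[] keeps the data.frame structure while replacing every column
df[] <- lapply(df, null_to_na)
```

Replacing with another label such as "Z" on a factor would instead need the level itself renamed, for example levels(x)[levels(x) == "NULL"] <- "Z", because assigning a value that is not an existing level produces NA with a warning.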
CPak

In general, R works better with NA values instead of NULL values. If by NULL values you mean the value actually says "NULL", as opposed to a blank value, then you can use this to replace NULL factor values with NA:

# stringsAsFactors=TRUE makes the columns factors on R >= 4.0 as well
df <- data.frame(Var1=c('value1','value2','NULL','value4','NULL'),
                 Var2=c('value1','value2','value3','NULL','value5'),
                 stringsAsFactors=TRUE)

#Before
    Var1   Var2
1 value1 value1
2 value2 value2
3   NULL value3
4 value4   NULL
5   NULL value5

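# Replace the literal string "NULL" with NA in every column;
# df[] keeps the result a data frame, and each column ends up a factor
df[] <- lapply(df, function(x) {
  x <- as.character(x)
  x[x == "NULL"] <- NA
  factor(x)
})

#After
    Var1   Var2
1 value1 value1
2 value2 value2
3   <NA> value3
4 value4   <NA>
5   <NA> value5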
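
If the "NULL" strings come from an imported file, they can also be turned into NA at read time. The sketch below assumes the data is read with read.csv; the inline text stands in for a real file, and the object name imported is made up for illustration.

```r
# Inline CSV text used here in place of a real file path
csv_text <- "Var1,Var2\nvalue1,NULL\nNULL,value2\nvalue3,value3"

# na.strings lists the strings that should be read as NA
imported <- read.csv(text = csv_text,
                     na.strings = c("NULL", "NA"),
                     stringsAsFactors = TRUE)

# Count the NA values per column to confirm the conversion
na_counts <- colSums(is.na(imported))
```

Because the strings never enter the data as values, the factor columns do not get a "NULL" level in the first place.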

It really depends on what the content of your column looks like, though. The above only really makes sense for columns that aren't numeric. If the values in a column are meant to be numeric, then as.numeric() converts every entry that cannot be parsed as a number (such as "NULL") to NA, with a warning. Note that it's important to convert factors to character before converting to numeric, because as.numeric() applied directly to a factor returns the internal level codes rather than the values; so use as.numeric(as.character(x)), as shown below:

df <- data.frame(Var1=c('1','2','NULL','4','NULL'))

df$Var1 <- as.numeric(as.character(df$Var1))

#After
  Var1
1    1
2    2
3   NA
4    4
5   NA
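
To illustrate why the as.character() step matters, here is a small sketch on a made-up factor; the variable names are only for illustration.

```r
# A factor whose labels look like numbers, plus a "NULL" entry
f <- factor(c("10", "20", "NULL", "40"))

# Direct conversion returns the internal integer level codes, not 10, 20, 40
codes <- as.numeric(f)

# Going through character first recovers the values; "NULL" becomes NA
# (suppressWarnings hides the "NAs introduced by coercion" warning)
values <- suppressWarnings(as.numeric(as.character(f)))
```

Comparing codes and values side by side shows that only the second conversion keeps the original numbers.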
www
  • @L.Geerlofs - If this was helpful, please remember to select a solution to help the community know it's solved and to help others with a similar question find their answer even faster. – www Sep 13 '17 at 00:18