0

One of my variables is the garage type. Here is the summary of the field in R:

  summary(train$GarageType)

 2Types  Attchd Basment BuiltIn CarPort  Detchd    NA's 
      6     870      19      88       9     387      81

I know that wherever the value is NA, there is no garage in place. Hence I need to put in a value like 'null' or ''.

How can I set train$GarageType to 'null' wherever train$GarageType is NA?

The expected output would look like this:
      summary(train$GarageType)

     2Types  Attchd Basment BuiltIn CarPort  Detchd    NULL
          6     870      19      88       9     387      81

That is, NULL should count as a valid category.

The closest solution I got is:

> x<-train
> x$GarageType <- factor(ifelse( is.na(x$GarageType), "NULL", x$GarageType))
> summary(x$GarageType)
   1    2    3    4    5    6 NULL 
   6  870   19   88    9  387   81 
> summary(train$GarageType)
 2Types  Attchd Basment BuiltIn CarPort  Detchd    NA's 
      6     870      19      88       9     387      81 

With this I could rename NA to NULL, but the other levels such as 2Types and Attchd turned into 1, 2, and so on.

user2458922
  • See [section 2.7 of CRAN's "An Introduction to R"](https://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors). – duckmayr Nov 05 '17 at 01:45
  • Can you add some example code with `dput(head(train$GarageType))` and then tell us what you are hoping to see as a result? If you are trying to replace NA values, then that is already answered here: https://stackoverflow.com/questions/8161836/how-do-i-replace-na-values-with-zeros-in-an-r-dataframe - but substitute "null" or "" for 0. – leerssej Nov 05 '17 at 11:16
  • If you replace NA values with "NULL" in a numeric variable, then it will become a character variable. Make sure this won't affect your analysis. – AntoniosK Nov 05 '17 at 14:15
  • The expected result needs to identify NULL or Empty as a type of garage. The expected output would be like `summary(train$GarageType)` `2Types Attchd Basment BuiltIn CarPort Detchd NULL/Empty 6 870 19 88 9 387 81` – user2458922 Nov 05 '17 at 18:41
  • `NA` is the best value to use there. `NULL` is not allowed in data frames, and using a character `"null"` would coerce columns to undesirable types. – Rich Scriven Nov 05 '17 at 19:39
  • Whether it is NULL, Empty, NILL, or XYZ, I want to substitute a default value for all NAs in one particular column. I am not particular about NULL itself; any string I choose to substitute for NA would do. – user2458922 Nov 05 '17 at 22:25
  • @JensLeerssen, that solution does not work in this case. The difference is that 'GarageType' is the only column I want to change, and it is not a numeric value either. I got the result
    summary(x$GarageType)
    1 2 3 4 5 6 NILL
    6 870 19 88 9 387 81
    What I need is
    2Types Attchd Basment BuiltIn CarPort Detchd NILL
    6 870 19 88 9 387 81
    When I call summary, I need NA to be replaced by Null or Nill or some custom value.
    – user2458922 Nov 06 '17 at 15:41

3 Answers

0

If I read your question correctly, you need to use ifelse(). There are two ways of doing this.

# Creating a simple reproducible example (requires dplyr)
library(dplyr)
x <- tibble(GarageType = c(1:20, rep(NA, 20)))

# Changing the column directly
# (mixing numbers with "NULL" turns the column into character)
x$GarageType <- ifelse(is.na(x$GarageType), "NULL", x$GarageType)

# Doing the same replacement with mutate() (needs dplyr)
x <- x %>% mutate(GarageType = ifelse(is.na(GarageType), "NULL", GarageType))
Henry Cyranka
  • Didn't work. Before: `summary(x$GarageType) 2Types Attchd Basment BuiltIn CarPort Detchd NA's 6 870 19 88 9 387 81`. After: `summary(x$GarageType) Length Class Mode 1460 character character` – user2458922 Nov 05 '17 at 15:11
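
A note on why `ifelse()` behaves this way on a factor such as `GarageType`: the factor's values are inserted into the result as their integer codes, which is where the 1, 2, 3 in the question come from. The sketch below uses a small made-up factor (not the real `train` data) and assumes that converting to character first and then rebuilding the factor is acceptable for the analysis.

```r
# Toy factor standing in for train$GarageType (values are made up)
garage <- factor(c("Attchd", "Detchd", NA, "BuiltIn", NA))

# Passing the factor straight to ifelse() inserts its integer codes
codes <- ifelse(is.na(garage), "NULL", garage)
print(codes)

# Converting to character first keeps the original labels
as_text <- ifelse(is.na(garage), "NULL", as.character(garage))

# Rebuild a factor so summary() counts each category again
fixed <- factor(as_text)
print(summary(fixed))
```

With the real data, the same two steps would be applied to `train$GarageType` instead of the toy `garage` vector.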
0

If you want to replace the NAs of only one variable, you can add a new column to the data frame, fill it row by row, and then use it to replace the old column/variable.

df$newcol <- NA

Then use a for loop with an if/else condition:

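for (i in seq_len(nrow(df))) {
    if (is.na(df[i, "oldcol"])) {
        # Missing value: store the placeholder string
        df[i, "newcol"] <- "null"
    } else {
        # as.character() keeps the label if oldcol is a factor
        df[i, "newcol"] <- as.character(df[i, "oldcol"])
    }
}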

Then assign the new column's values back to the old column:

df$oldcol <- df$newcol
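
For comparison, the same replacement can usually be written without a loop by using logical indexing. This is only a sketch: the data frame and its values are invented, `oldcol` is the placeholder name from the answer above, and the column is assumed to be character rather than factor.

```r
# Hypothetical data frame; oldcol is a character column with some NAs
df <- data.frame(oldcol = c("a", NA, "b", NA), stringsAsFactors = FALSE)

# is.na() marks the missing rows; assigning through that index
# replaces all of them in a single step
df$oldcol[is.na(df$oldcol)] <- "null"
print(df$oldcol)
```

If the column is a factor, the new value has to exist as a level first (or the column has to be converted with `as.character()`), otherwise R warns and leaves the entries as NA.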

0

I found the same question with an answer at https://datascience.stackexchange.com/users/41522/salb

I tried it and got the expected result.

> df <- train 
> levels <- levels(df$GarageType)
> summary(levels)
   Length     Class      Mode 
        6 character character 
> levels
[1] "2Types"  "Attchd"  "Basment" "BuiltIn" "CarPort" "Detchd" 
> levels[length(levels) + 1] <- "None"
> df$GarageType <- factor(df$GarageType, levels = levels)
> summary(df$GarageType)
 2Types  Attchd Basment BuiltIn CarPort  Detchd    None    NA's 
      6     870      19      88       9     387       0      81 
> df$GarageType[is.na(df$GarageType)] <- "None"
> summary(df$GarageType)
 2Types  Attchd Basment BuiltIn CarPort  Detchd    None 
      6     870      19      88       9     387      81 
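
The steps in the session above can be condensed a little. The following is a minimal sketch on a made-up factor rather than the real `train` data; it appends the new level with `levels<-` instead of rebuilding the factor, which should give the same result under the assumption that the column is already a factor.

```r
# Toy factor standing in for train$GarageType (values are made up)
garage <- factor(c("Attchd", "Detchd", NA, "BuiltIn", NA))

# Step 1: register "None" as an additional level
levels(garage) <- c(levels(garage), "None")

# Step 2: now the NAs can be overwritten with the new level
garage[is.na(garage)] <- "None"
print(summary(garage))
```

The order matters: assigning "None" before it exists as a level produces a warning and leaves the entries as NA.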
user2458922