How to impute missing values separately for each value in a column

Question

I have a df:

Context <- c(HUM, HUM, DEV, HUM, DEV, HUM, DEV)
Amount <- c(100, 150, NA, NA, 500, 150, 600)

What I am interested in is imputing the missing values separately for Context = DEV and for Context = HUM. So I want to impute 2 different values, depending on Context.

I have tried writing an if statement, but it does not work.

First I found the average for HUM and DEV in Context:

df %>%
  group_by(Context) %>%
  summarise(mean_amount = mean(Amount, na.rm = TRUE))

I then assigned the mean values for HUM and DEV:

mean_hum <- 133
mean_dev <- 550

Then to impute a value for when Context = DEV and When Context = HUM:

df$impute_amount <- df %>%
  if (Context == "HUM") {
  ifelse(is.na(df$Amount), mean_hum, df$Amount)
  }if (Context == "Dev"){
    ifelse(is.na(df$Amount), mean_dev, 
df$Amount)
  }

However, I get the message: Error: unexpected '}' in " }"

Where am I going wrong?

I hope that someone can help me move on from here.

Thank you!

`}if (Context == "Dev"){` has to be either `} else if ...` or the second `if` has to start on a new line. This is a typical example of a reason to vote to close (simple typo). — Rui Barradas, Apr 13 '18 at 15:54
I have now put space between the two if statements, but now the code can't find my column Context: Error: object 'Context' not found — BloopFloopy, Apr 13 '18 at 16:21

score 1 · Answer 1 · answered Apr 13 '18 at 17:40

I believe the code below does what you ask for.
First of all, the data you have supplied is wrong: "HUM" and "DEV" have to be put between quotes.
I have taken inspiration and part of the code from the accepted answer to this question. The part I borrowed is the helper function impute.mean.

impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))
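As a quick check of what the helper does on its own, here is a small sketch on a plain numeric vector (the vector `x` below simply reuses the HUM amounts from the question). `replace()` overwrites only the positions where `is.na(x)` is TRUE, and the mean is taken over the non-missing values.

```r
impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# The HUM amounts from the question, including the missing one
x <- c(100, 150, NA, 150)

# Only the NA position is overwritten; the other values are left untouched
x_imputed <- impute.mean(x)
print(x_imputed)
```

Inside `group_by()` and `mutate()`, the helper is called once per group, so each group's NAs are filled with that group's own mean.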

df %>%
    group_by(Context) %>%
    mutate(impute_amount = impute.mean(Amount))
## A tibble: 7 x 3
## Groups:   Context [2]
#  Context Amount impute_amount
#  <fct>    <dbl>         <dbl>
#1 HUM       100.          100.
#2 HUM       150.          150.
#3 DEV        NA           550.
#4 HUM        NA           133.
#5 DEV       500.          500.
#6 HUM       150.          150.
#7 DEV       600.          600.
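For comparison, the attempt in the question can probably be repaired as well. `if` tests a single TRUE/FALSE value and cannot be piped into like that, and `Context` is a column of `df` rather than a standalone object, which would explain the "object 'Context' not found" error. A minimal sketch using vectorised `ifelse()`, assuming the data from the question and computing the group means instead of typing them in by hand:

```r
df <- data.frame(
  Context = c("HUM", "HUM", "DEV", "HUM", "DEV", "HUM", "DEV"),
  Amount  = c(100, 150, NA, NA, 500, 150, 600)
)

# Group means, ignoring missing values
mean_hum <- mean(df$Amount[df$Context == "HUM"], na.rm = TRUE)
mean_dev <- mean(df$Amount[df$Context == "DEV"], na.rm = TRUE)

# ifelse() works element by element: keep observed amounts,
# otherwise fill in the mean of the row's own Context
df$impute_amount <- ifelse(
  !is.na(df$Amount),
  df$Amount,
  ifelse(df$Context == "HUM", mean_hum, mean_dev)
)
print(df)
```

With more than two groups the nested `ifelse()` calls get unwieldy, which is one reason to prefer the grouped approach above.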

DATA

df <-
structure(list(Context = structure(c(2L, 2L, 1L, 2L, 1L, 2L, 
1L), .Label = c("DEV", "HUM"), class = "factor"), Amount = c(100, 
150, NA, NA, 500, 150, 600)), .Names = c("Context", "Amount"), row.names = c(NA, 
-7L), class = "data.frame")
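If dplyr is not available, base R's `ave()` applies a function within groups and returns a vector of the same length as its input, so the same helper can be reused. This is a sketch of an alternative, not part of the code above; it builds the data with `data.frame()` and quoted strings instead of the `structure()` call.

```r
impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# Same data as above, with the Context values quoted
df <- data.frame(
  Context = factor(c("HUM", "HUM", "DEV", "HUM", "DEV", "HUM", "DEV")),
  Amount  = c(100, 150, NA, NA, 500, 150, 600)
)

# ave(x, g, FUN = f) applies f to x separately within each level of g
df$impute_amount <- ave(df$Amount, df$Context, FUN = impute.mean)
print(df)
```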

Thank you so much - this really helped :-) — BloopFloopy, Apr 14 '18 at 08:55
