Figuring out conditional column transformation to make new column code with dplyr

Question

I'm working with a time series that has a couple of thousand rows, but here's a small sample of the two columns in question:

data <- data.frame(
        Precipitation = sample(c("0.12", "0.14", "0.08", "0.30", "0.10", "0.40", "1.6", "0", "0")),
        Character = sample(c("A", "B", "C", "D", "E", "F", "G", "H", "I")))

Each value in the Precipitation column corresponds to the letter in the Character column (i.e. 0.12 -> A, 0.14 -> B, etc.).

Each of those letters represents a potential "change" that needs to be applied to the values in the Precipitation column, as follows:

Precipitation values with letter A are fine as is
Precipitation values with letter B need to be divided by 2
Precipitation values with letter C need to be divided by 3
Precipitation values with letter D need to be divided by 4
Precipitation values with letter E need to be divided by 2
Precipitation values with letter F need to be divided by 4
Precipitation values with letter G need to be divided by 4
Precipitation values with letter H are fine as is
Precipitation values with letter I are fine as is

Now, I want to make a new column using dplyr that applies the divisions indicated by the Character column to the Precipitation column, while carrying over the A, H, and I rows unchanged. What would the code look like to do this?

Thank you for your help! It is much appreciated.

What have you tried? You may want to look at `case_when` in `dplyr` — Calum You, Dec 14 '18 at 22:12
@CalumYou I don't know how to approach this code-wise, so I'm looking for someone to help me with an example. :) — SecretBeach, Dec 14 '18 at 22:16
see case_when (https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/case_when) — DJV, Dec 14 '18 at 22:35

score 1 · Accepted Answer · answered Dec 15 '18 at 07:01

Something like this? It uses case_when from dplyr.

library(tidyverse)
data <- tibble(
  Precipitation = sample(c(0.12, 0.14, 0.08, 0.30, 0.10, 0.40, 1.6, 0, 0)), 
  Character = sample(c("A", "B", "C", "D", "E", "F", "G", "H", "I")))

I assume that your precipitation numbers were meant to be numbers and not characters or factors, so no quotation marks.
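If the real column does hold text, it would need converting before any division. A minimal sketch, assuming a character column; the names data_chr and data_num are made up for illustration. If the column is a factor instead (the default for strings in data.frame() before R 4.0), go through as.character() first, since as.numeric() on a factor returns the level codes.

```r
library(dplyr)

# Hypothetical cut-down version of the question's data,
# with Precipitation stored as text
data_chr <- tibble(
  Precipitation = c("0.12", "0.14", "0.08"),
  Character = c("A", "B", "C")
)

# Convert the text column to numeric before doing arithmetic on it
# (for a factor column, use as.numeric(as.character(Precipitation)))
data_num <- data_chr %>%
  mutate(Precipitation = as.numeric(Precipitation))
```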

data2 <- data %>% 
  mutate(new = case_when(Character == "B" ~ Precipitation/2,
                         Character == "C" ~ Precipitation/3,
                         Character == "D" ~ Precipitation/4,
                         Character == "E" ~ Precipitation/2,
                         Character == "F" ~ Precipitation/4,
                         Character == "G" ~ Precipitation/4,
                         TRUE ~ Precipitation))

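case_when checks the conditions in order and uses the first one that matches. The final TRUE is a catch-all, so any row whose letter is not "B" to "G" (here A, H, and I) keeps its original Precipitation value.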

# A tibble: 9 x 3
  Precipitation Character   new
          <dbl> <chr>     <dbl>
1          0.12 F         0.03 
2          0.4  H         0.4  
3          0.3  B         0.15 
4          0.08 E         0.04 
5          0    I         0    
6          0.14 D         0.035
7          1.6  G         0.4  
8          0    C         0    
9          0.1  A         0.1
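Since every rule here is "divide by some number", another option is a lookup vector of divisors instead of one case_when branch per letter. This is only a sketch of an alternative, not part of the code above; the names divisors and data3 are made up for illustration, and the sample is left unshuffled so the result is reproducible.

```r
library(dplyr)

# Fixed (unshuffled) sample: 0.12 -> A, 0.14 -> B, and so on
data <- tibble(
  Precipitation = c(0.12, 0.14, 0.08, 0.30, 0.10, 0.40, 1.6, 0, 0),
  Character = c("A", "B", "C", "D", "E", "F", "G", "H", "I")
)

# Divisor for each letter, taken from the list in the question
# (1 means the value is left unchanged)
divisors <- c(A = 1, B = 2, C = 3, D = 4, E = 2, F = 4, G = 4, H = 1, I = 1)

# Look up each row's divisor by its letter, then divide;
# unname() drops the letter names from the looked-up vector
data3 <- data %>%
  mutate(new = Precipitation / unname(divisors[Character]))
```

A letter that is missing from divisors would give NA in the new column, whereas the TRUE branch of case_when passes unknown letters through unchanged, so pick whichever behaviour suits the data.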
