Recoding in R using dplyr (or something else)

Question

I am certainly no expert in R yet. I grew up with SPSS and am slowly shifting to R. I solve problems as I meet them and seek help when I get lost.

Please look at this code:

dataset$v18[dataset$s_18 == 1] <- "Agree"
dataset$v18[dataset$s_18 == 2] <- "Partly Agree"
dataset$v18[dataset$s_18 == 3] <- "Neutral"
dataset$v18[dataset$s_18 == 4] <- "Partly disagree"
dataset$v18[dataset$s_18 == 5] <- "Disagree"

sv18x <- dataset %>%
  filter(!is.na(v18)) %>%
  group_by(v18) %>% 
  dplyr::summarise(count=n()) %>% 
  mutate(pct=count/sum(count)*100) 

sv18x$v18 <- factor(sv18x$v18, levels = c("Agree", "Partly Agree", "Neutral", "Partly disagree", "Disagree"))
sv18x$pct <- trunc(sv18x$pct)

I feel quite confident that this can be done in a shorter and smarter way. I think it should be done using dplyr::recode() and something else that I probably don't know yet, but I can't figure out how to do it. Can someone give me a hint?

Sorry, Lennyy. I am not quite sure what you mean. What I mean is that especially the first five lines can be written in a smarter and shorter way. I have about 50 variables that I have to recode in the same way to plot them afterwards. Best from Methods — Metods, Sep 21 '18 at 14:53
Check out `dplyr::case_when` for the top part, or use `factor` and define both `levels` and `labels` arguments. Your last two statements can be done inside `mutate` as well. — mikeck, Sep 21 '18 at 14:53
Could you provide `dataset`? If possible, edit your question according to: [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — nghauran, Sep 21 '18 at 14:58
`likert <- c("Agree","Partly Agree","Neutral","Partly Disagree","Disagree"); dataset$v18 <- likert[ as.integer(dataset$s_18) ]` is one way, though it is not foolproof. — r2evans, Sep 21 '18 at 15:21
Thanks a lot for your help. ANG: I will do that in the future - thanks for the tip — Metods, Sep 22 '18 at 15:42

score 1 · Answer 1 · answered Sep 21 '18 at 15:20

I simulated a reproducible example to help you, but it's hard to know what you want without the real dataset. The first part can be done with dplyr::case_when(), while the percentage part can be done with the janitor package.

library(dplyr)
library(janitor)

# Simulated data: 150 rows, one arbitrary grouping column and one 1-5 item
dataset <- data.frame(ola = sample(c("a", "b", "c"), 150, replace = TRUE),
                      s_18 = sample(1:5, 150, replace = TRUE))

dataset <- dataset %>%
   mutate(v18 = case_when(
          s_18 == 1 ~ "Agree",
          s_18 == 2 ~ "Partly Agree",
          s_18 == 3 ~ "Neutral",
          s_18 == 4 ~ "Partly Disagree",
          s_18 == 5 ~ "Disagree"
          ))

sv18x <- dataset %>%
  count(v18) %>%
  janitor::adorn_percentages("col") %>%
  janitor::adorn_pct_formatting()
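
As suggested in the comments, factor() with both levels and labels might shorten the recoding further, since it maps the codes and fixes the label order in one step. The following is only a sketch under assumptions: it uses simulated data, assumes s_18 holds the codes 1 to 5 (possibly with NA), and the name likert_labels is made up for illustration.

```r
library(dplyr)

# Assumed label set, in scale order (1 = "Agree", ..., 5 = "Disagree")
likert_labels <- c("Agree", "Partly Agree", "Neutral",
                   "Partly Disagree", "Disagree")

# Simulated data standing in for the real dataset (assumption)
set.seed(1)
dataset <- data.frame(s_18 = sample(c(1:5, NA), 150, replace = TRUE))

sv18x <- dataset %>%
  # factor() turns codes 1..5 into labelled, ordered levels in one call
  mutate(v18 = factor(s_18, levels = 1:5, labels = likert_labels)) %>%
  # drop missing answers before computing shares
  filter(!is.na(v18)) %>%
  count(v18, name = "count") %>%
  # truncated percentage, as in the question
  mutate(pct = trunc(count / sum(count) * 100))

print(sv18x)
```

Because v18 is already a factor with the levels in scale order, no separate factor() call on the summary table should be needed before plotting.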

Hope this helps!

Thanks a lot. For sure a better solution. I will use it. I still think it is a lot of code. I have several variables to recode. But you helped me move forward - thanks! — Metods, Sep 22 '18 at 15:43
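
For the case with many variables, one possible approach is to recode all of them in a single mutate() with across(). This is a sketch, not a tested solution for the real data: it assumes dplyr 1.0.0 or later and tidyr are available, that every item column starts with "s_" and uses the same 1 to 5 coding, and the columns s_19 and s_20 as well as the "_lab" suffix are invented for the example.

```r
library(dplyr)
library(tidyr)

likert_labels <- c("Agree", "Partly Agree", "Neutral",
                   "Partly Disagree", "Disagree")

# Simulated data; s_19 and s_20 are hypothetical extra items
set.seed(1)
dataset <- data.frame(
  s_18 = sample(1:5, 150, replace = TRUE),
  s_19 = sample(1:5, 150, replace = TRUE),
  s_20 = sample(1:5, 150, replace = TRUE)
)

# Recode every "s_" column into a labelled factor stored as "<name>_lab"
recoded <- dataset %>%
  mutate(across(starts_with("s_"),
                ~ factor(.x, levels = 1:5, labels = likert_labels),
                .names = "{.col}_lab"))

# One long table with counts and truncated percentages per item
pct_table <- recoded %>%
  pivot_longer(ends_with("_lab"), names_to = "item", values_to = "answer") %>%
  filter(!is.na(answer)) %>%
  count(item, answer, name = "count") %>%
  group_by(item) %>%
  mutate(pct = trunc(count / sum(count) * 100)) %>%
  ungroup()

print(pct_table)
```

The long table keeps one row per item and answer, which may be convenient for plotting all items at once, for example with one facet per item.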
