I'm trying to split a variable v_435 (values 1, 2, 3, 4, 5, 98, 99) from the dataset ARR2 into v_435_low (values 1 and 2) and v_435_high (values 4 and 5).

Unfortunately, I don't know how to exclude 98 and 99 from v_435_high.

My code:

# v_435_low
ARR2 %>%
  group_by(v_435 <= 2) %>%
  summarize(n = n()) %>%
  mutate(freq = n / sum(n))

# v_435_high
ARR2 %>%
  group_by(????????????) %>%
  summarize(n = n()) %>%
  mutate(freq = n / sum(n))

martis
  • In my opinion it would be easier to just create a new variable that does the grouping for you (with your criteria that you outlined), then use `group_by` on that new variable. Could this be a reasonable approach for you? – Harrison Jones Nov 11 '21 at 17:42
  • where does 3 fall? You say 1-2 ->low, 4-5 ->high. What about 3? – Onyambu Nov 11 '21 at 17:42
  • Thank you for your quick answers. Unfortunately I'm a total beginner, so I'm not even sure how to create a variable with the criteria outlined. 3 will be kicked out for theoretical reasons. But I don't find a way to select 4 and 5 for a variable without selecting 98 and 99. – martis Nov 11 '21 at 17:46
  • Hello @martis. Welcome to SO! Please provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so that other SO users can help you the best way. Thanks. – lovalery Nov 11 '21 at 17:50

1 Answer

Update the `group_by` line to match only 4 and 5, which leaves 98 and 99 out of the high group:

library(tidyverse)

# simulating some of my own data
ARR2 <- tibble(
  v_435 = sample(c(1:5, 98, 99), size = 100, replace = TRUE)
)

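# TRUE = high (4 or 5); FALSE = every other value
ARR2 %>%
  group_by(v_435 %in% c(4, 5)) %>%
  summarize(n = n()) %>%
  mutate(freq = n / sum(n))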
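
One thing to watch with the logical grouping above: the FALSE group still holds 98 and 99 (together with 1, 2, and 3), so they count toward the denominator of any share you compute. If 98 and 99 are missing-value codes (an assumption, the question does not say), a minimal sketch is to filter them out first. The `toy` data and the column name `high` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data with a known composition (not the asker's ARR2)
toy <- tibble(v_435 = c(1, 1, 2, 3, 4, 4, 5, 98, 99, 99))

# Assumption: 98 and 99 are missing-value codes, so drop them before grouping
high_share <- toy %>%
  filter(!v_435 %in% c(98, 99)) %>%
  group_by(high = v_435 %in% c(4, 5)) %>%
  summarize(n = n()) %>%
  mutate(freq = n / sum(n))

high_share
```

Here `freq` is the share among the remaining answers (1 to 5) only. Whether 3 should also be dropped before computing shares depends on the analysis.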
Harrison Jones
  • Thank you for your help! Say I want to stay within the tidyverse pipe logic, since I want to group the 1+2 answers (low) and the 4+5 answers (high) of v_435 and then keep calculating: is there any way to group these values into one variable each using `group_by()`? Warm regards – martis Nov 11 '21 at 18:37
  • The answer I provided is already in tidyverse pipe logic. The `mutate` line has created a new variable called `v_435_groups`, which you can then do whatever you want with. I'm not sure you fully grasp what `group_by` does; it's not meant to collapse values into a single group. – Harrison Jones Nov 11 '21 at 18:57
  • If you're really just looking for what should go in `group_by` from your question, then you're looking for `group_by(v_435 %in% c(4, 5))` – Harrison Jones Nov 11 '21 at 19:00
  • Thank you so much, that was exactly what I was looking for! :) – martis Nov 11 '21 at 19:08
  • Okay fair enough. I updated my answer to reflect what you were looking for. – Harrison Jones Nov 11 '21 at 20:29
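
The comments mention a `mutate` line that creates `v_435_groups`, but that version of the answer is not shown here. The following is only a sketch of what such an approach could look like, not the answerer's original code. It assumes `case_when()` for the recoding and uses made-up `toy` data; values 3, 98, and 99 become `NA` and are dropped.

```r
library(dplyr)

# Hypothetical toy data (not the asker's ARR2)
toy <- tibble(v_435 = c(1, 1, 2, 3, 4, 4, 5, 98, 99, 99))

grouped <- toy %>%
  mutate(
    # Recode into one grouping variable; anything else becomes NA
    v_435_groups = case_when(
      v_435 %in% c(1, 2) ~ "low",
      v_435 %in% c(4, 5) ~ "high",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(v_435_groups)) %>%  # drops 3, 98 and 99
  count(v_435_groups) %>%           # one row per group with its size n
  mutate(freq = n / sum(n))

grouped
```

With a single grouping variable, low and high come out of one pipe, and further calculations can continue from `grouped` or from the recoded column.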