Removing missing from specific variables in ggplot

Question

I am creating a bar graph from a series of related survey variables. Not every question applied to every respondent: people could choose Increased, Same, Decreased, or Not applicable. As a result, each variable in my bar graph has a different number of NA responses.

I want a bar graph that shows only the percentage of "Decreased" responses for each variable, computed from that variable's complete cases rather than from the complete cases of the whole dataset. NAs should not count toward a variable's total, so each variable has a different N. How do I go about doing this? Functions like `df <- na.omit(df)` drop every row that contains an NA in any column, which is not what I want.

This is the code that I have written so far to set up my dataset.



df <- select(dataset, variable1, variable2, variable3, variable4,
             variable5, variable6, variable7)

df2 <- df %>%
  pivot_longer(everything(), names_to = "vars", values_to = "value") %>%
  group_by(vars, value) %>%
  dplyr::summarise(n = n()) %>%
  mutate(pct = n / sum(n) * 100) %>%
  ungroup() %>%
  subset(value == "Decreased")

plot <- df2 %>%
  ggplot() +
  geom_col(aes(vars, pct), fill = "black")

It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Apr 03 '23 at 18:52
Please share a few rows of sample input data that illustrate your problem. `dput()` is the friendliest way to share data, `dput(df[1:10, ])` will make a copy/pasteable version of the first 10 rows of `df` - choose a suitable small subset to illustrate your issue. — Gregor Thomas, Apr 03 '23 at 18:53

score 0 · Answer 1 · answered Apr 03 '23 at 20:20

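You can add the following line to your pipeline right after `pivot_longer()`, before `group_by()`:

 filter(!is.na(value))

Because the data is in long format at that point, this removes only the individual missing responses, not whole respondents, so each variable keeps its own N. Filtering on `pct` after the `mutate()` line would have no effect: `pct` itself is never NA, and the missing responses would already be counted in `sum(n)`.

If other columns contain NAs that you also need to exclude, add a condition for each one to the same `filter()` call.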
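As a rough illustration of per-variable NA removal, here is a minimal sketch. The toy `dataset` below is made up for the example (the question did not include sample data), and it assumes missing or not-applicable answers are stored as `NA`. The summary columns `n_valid` and `n_decreased` are hypothetical names, not part of the original code.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: NA marks a missing / not-applicable response.
dataset <- data.frame(
  variable1 = c("Decreased", "Same", NA, "Decreased"),
  variable2 = c("Increased", NA, NA, "Decreased"),
  variable3 = c("Decreased", "Decreased", "Same", "Increased")
)

df2 <- dataset %>%
  pivot_longer(everything(), names_to = "vars", values_to = "value") %>%
  # Drop missing responses row by row, so each variable keeps its own N.
  filter(!is.na(value)) %>%
  group_by(vars) %>%
  summarise(
    n_valid     = n(),                        # non-missing answers per variable
    n_decreased = sum(value == "Decreased"),  # "Decreased" answers per variable
    pct         = 100 * n_decreased / n_valid
  )

plot <- ggplot(df2, aes(vars, pct)) +
  geom_col(fill = "black")
```

With this toy data, `variable1` has 3 valid answers and the other two have 2 and 4, so each percentage uses a different denominator. Summarising by `vars` alone also keeps a variable with no "Decreased" answers in the plot as a 0% bar. An equivalent way to drop the missing rows is `pivot_longer(..., values_drop_na = TRUE)`.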

tales_alencar
