Graphing percentage completion at multiple time intervals and separating by a third variable

Question

df<-data.frame(location=c(1:5),Jan=c(0.5,0.6,0.5,0.4,0.55),Feb=c(0.4,0.33,0.5,0.33,0.4),Mar=c(0.25,0.2,0.33,0.25,0.35))
  location  Jan  Feb  Mar
1        1 0.50 0.40 0.25
2        2 0.60 0.33 0.20
3        3 0.50 0.50 0.33
4        4 0.40 0.33 0.25
5        5 0.55 0.40 0.35

The data frame above records percent completion by location and month. I would like a graph showing how completion changes over time for each location. I've tried something like the following, but can't get it right.

ggplot(df,aes(x=c(Jan,Feb,Mar), y = Jan:Mar, color = location))+geom_point()

jared_mamrot · Accepted Answer · 2022-09-01T00:09:39.443

There are many, many ways to plot your example dataset. Does this approach solve your problem?

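library(tidyverse)
library(scales)

df <- data.frame(location=c(1:5),
                 Jan=c(0.5,0.6,0.5,0.4,0.55),
                 Feb=c(0.4,0.33,0.5,0.33,0.4),
                 Mar=c(0.25,0.2,0.33,0.25,0.35))

df %>%
  # wide -> long: one row per location/month, values go to the default "value" column
  pivot_longer(-location,
               names_to = "month") %>%
  # ordered factor so months plot in calendar order, not alphabetically
  mutate(month = factor(month, 
                        levels = month.abb,
                        ordered = TRUE)) %>%
  ggplot(aes(x = month, y = value)) +
  geom_point() +
  # group = 1 connects the points within each panel; lty = 2 draws a dashed line
  geom_line(aes(group = 1),
            lty = 2) +
  # one panel per location, labelled "location: 1", "location: 2", ...
  facet_wrap(~location, nrow = 1, labeller = label_both) +
  # show 0.4 as 40%
  scale_y_continuous(labels = percent)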

Created on 2022-09-01 by the reprex package (v2.0.1)

More details: your example dataset is currently in 'wide' format; converting it to 'long' format with pivot_longer() makes it easier to work with (https://tidyr.tidyverse.org/reference/pivot_longer.html). Next, mutate() converts month from character to an ordered factor so that the months appear in calendar order (Jan, Feb, Mar, as in the built-in constant month.abb) rather than alphabetical order (Feb, Jan, Mar); see e.g. "Sorting months in R". Then ggplot2 adds the points and lines, and facet_wrap() creates one panel per location. Finally, the y-axis labels are formatted as percentages (e.g. 40% instead of 0.4).
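To make the reshaping and ordering steps concrete, here is a minimal sketch that runs them on their own, without the plot. It uses the same df as above; the intermediate name long is only for illustration and is not part of the original answer.

```r
library(tidyr)
library(dplyr)

df <- data.frame(location = 1:5,
                 Jan = c(0.5, 0.6, 0.5, 0.4, 0.55),
                 Feb = c(0.4, 0.33, 0.5, 0.33, 0.4),
                 Mar = c(0.25, 0.2, 0.33, 0.25, 0.35))

# Wide -> long: one row per location/month pair (5 locations x 3 months = 15 rows)
long <- df %>%
  pivot_longer(-location, names_to = "month")

# Without explicit levels, factor() sorts the labels alphabetically
levels(factor(long$month))
#> [1] "Feb" "Jan" "Mar"

# With levels = month.abb the calendar order is kept
long <- long %>%
  mutate(month = factor(month, levels = month.abb, ordered = TRUE))

# Drop the unused months to see only the levels present in the data
levels(droplevels(long$month))
#> [1] "Jan" "Feb" "Mar"
```

Inspecting long at this point (for example with head(long)) is a quick way to confirm that the month column contains no NA values before passing it to ggplot().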

edited Sep 01 '22 at 00:09

answered Sep 01 '22 at 00:00


Thank you for taking the time. You're right I'm sure there's a hundred solutions. What I hadn't had sorted was how to get the data in the right format...factoring in this case. What I still don't understand is why my own factors all come out as NAs. The plot still works, but I'm curious if this is a fluke and will it work again. – M3Lba Sep 01 '22 at 01:45
This doesn't sound good: "What I still don't understand is why my own factors all come out as NAs". If you are able to edit your question to provide a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) I can help you troubleshoot. Is your problem related to your previous question: https://stackoverflow.com/questions/73391878/dplyrcase-when-vs-if-else-for-summary-column-requiring-two-conditions ? – jared_mamrot Sep 01 '22 at 03:21
Sorry I didn't get back to you sooner. I was finally able to get back to this project today. After retrying some code, I found that the full_join I did to build this dataset was the cause of the NA factors. Once I fixed that, the solution worked great. It wasn't the plot code that gave me NAs. – M3Lba Sep 14 '22 at 23:49
Glad you got it sorted :) – jared_mamrot Sep 15 '22 at 01:01
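For readers who hit the same symptom, the sketch below shows one way a full_join() can introduce NA values before any factor conversion happens. The asker's real data were never posted, so the two tables here (completion and sites) and their columns are invented purely for illustration.

```r
library(dplyr)

# Hypothetical tables, invented for illustration; not the asker's data
completion <- data.frame(location = c(1, 2, 3),
                         month = c("Jan", "Feb", "Mar"),
                         value = c(0.5, 0.33, 0.33))
sites <- data.frame(location = c(2, 3, 4),
                    region = c("north", "south", "east"))

# full_join() keeps unmatched rows from both sides and fills the gaps with NA
joined <- full_join(completion, sites, by = "location")

# Location 4 exists only in `sites`, so its month is already NA here;
# converting to a factor simply carries that NA through
joined$month_f <- factor(joined$month, levels = month.abb)

# Count the missing values that came from the join
sum(is.na(joined$month_f))
#> [1] 1
```

A related cause worth ruling out is a label that does not match any level: factor("January", levels = month.abb) also returns NA, because "January" is not one of the abbreviations in month.abb.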
