I am working on a use-case to achieve the attached expected output (ignore the alignment of legend names) using R.

enter image description here

library(ggplot2)
library(tidyverse)
library(reshape2)


#Creating a dataframe with use-case specific variables. (Missing values are intentional to replicate real-time data)
df = data.frame(
                Year = 2006:2025,

                Survey = c(40.5, 39.0, NA, NA, NA, NA, 29.9, NA, NA, NA, 21.6,
                             NA, NA, NA, NA, NA, NA, NA, NA, NA),

                WhatIf = c(NA, NA, NA, NA, NA, NA, 29.9, NA, NA, NA, NA,
                             NA, NA, NA, NA, NA, NA, NA, NA, 14.9),

                AARR = c(NA, NA, NA, NA, NA, NA, 29.9, NA, NA, NA, NA,
                             NA, NA, NA, NA, NA, NA, NA, NA, 13.0),

                Current = c(NA, NA, NA, NA, NA, NA, 29.9, 27.6, 25.4, 23.4, 21.6,
                             19.9, 18.4, 16.9, 15.6, 14.4, 13.3, NA, 12.2, 11.3)
                  )

# Method 1

#Data transformation to long format using melt() from the reshape2 package
# df_long <- melt(df,id.vars = "Year")
#
# ggplot(data=df_long,aes(x=Year,y=value, colour=variable)) +geom_line() +
#   theme(legend.position="bottom")

#Method 2

#Plot and adding lines - Year vs. rest of the columns
plot(df$Year, df$Survey, type = "o", col = "dark grey", pch = "o", lty = 1, ylim = c(0, max(df$Survey, na.rm = TRUE)), ylab = "Survey")

points(df$Year, df$WhatIf, col="orange" )
lines(df$Year, df$WhatIf, col="orange",lty=2)

points(df$Year, df$AARR, col="black")
lines(df$Year, df$AARR, col="black", lty=2)

points(df$Year, df$Current, col="dark blue")
lines(df$Year, df$Current, col="dark blue", lty=2)

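# A keyword position keeps the legend inside the plot (x = 1, y = 100 lies outside the axis ranges)
legend("topright", legend=c("Survey","WhatIf", "AARR","Current"),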
       col=c("dark grey","orange","black", "dark blue"),
       lty=c(1,2,2,2), ncol=1)

Created on 2020-06-30 by the reprex package (v0.3.0)

I have tried two methods in R to produce a basic plot with a legend and two y-axes, but I am unable to reproduce the expected output. The columns Current, WhatIf, and AARR should be drawn as dashed lines in different colours, and the Survey column should be represented by the symbol "o".

enter image description here

Any recommendations on the approach are greatly appreciated.

  • what exactly is it that you are trying to achieve? What have you tried? I feel this question may benefit from some more details, and more "conciseness" – tjebo Jun 30 '20 at 14:05
  • most importantly it is not clear if you want to achieve this with base R or with ggplot? – tjebo Jun 30 '20 at 14:10
  • @Tjebo, thanks for the response. What I'm trying to achieve is mentioned in the title (Line plot - Year column against the others using R ) with an attachment of the expected output plot. I tried two methods as mentioned in the comments. 1st method using ggplot by applying data transformation and 2nd method by using base R's plot function. – pradeepvaranasi Jun 30 '20 at 14:14
  • Is the question then about multiple lines for several variables? I think this should then answer your question. https://stackoverflow.com/questions/3777174/plotting-two-variables-as-lines-using-ggplot2-on-the-same-graph – tjebo Jun 30 '20 at 14:24
  • Or is it about drawing a line across missing values? https://stackoverflow.com/questions/9617629/connecting-across-missing-values-with-geom-line – tjebo Jun 30 '20 at 14:30
  • Or is it a question about secondary y-axis? https://stackoverflow.com/questions/3099219/ggplot-with-2-y-axes-on-each-side-and-different-scales – tjebo Jun 30 '20 at 14:31
  • I am sending this because your question is just not clear about this. – tjebo Jun 30 '20 at 14:31

1 Answer

As pointed out in the comments, your initial question is not very clear about which aspect of your plot and code you are having trouble with. However, reading your question and the comments that followed, it seems you are looking to replicate the image/plot you attached, which appears to have some gaps between points (separated by NA values), but also has some other key elements:

  1. A secondary y axis
  2. Some specific formatting with legend positioning and axis labels
  3. Formatting of lines as dashed and open circles for the labeled points

Here's the code that I came up with to replicate what I understand is your desired plot. I'll then explain a bit about how I used ggplot2 to configure some of the elements indicated above.

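# df_long is the long-format data from Method 1 in the question
df_long <- reshape2::melt(df, id.vars = "Year")

ggplot(df_long, aes(x=Year, y=value, color=variable)) +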
  theme_classic() +
  
  # data plotting layers
  geom_line(linetype='dashed', size=1) +
  geom_point(shape=21, size=3, fill='white') +
  
  # scale elements
  scale_y_continuous(
    breaks=seq(0,100, 10), labels = seq(0, 100, 10), limits=c(0,70), expand=expansion(mult=0.02),
    sec.axis = sec_axis(
      name='Stunting (%)', trans='identity',
      breaks=seq(0,100,10), labels=seq(0,100,10))) +
  
  # theme and overall plot look elements
  theme(
    legend.position = 'bottom', legend.direction = 'vertical',
    panel.grid.major.y = element_line(color='gray85'),
    axis.title = element_text(face='bold')
  ) +
  guides(color=guide_legend(ncol=2)) +
  labs(
    x='Year', y='Stunting (%)', color=NULL
  )

enter image description here

Data plotting layers: For lines, note the formatting using linetype. For points, use of shape=21 gives you an open circle with a fill value (here specified).
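One thing to be aware of with these layers: geom_line() breaks the line wherever a series has an NA between two observed values. A minimal sketch of one way around that, assuming it is acceptable to join the observed points directly, is to drop the NA rows before plotting. The name df_obs is made up for this illustration. In the question's data, WhatIf and AARR each have two non-missing values (2012 and 2025), so each would be drawn as a single dashed segment.

```r
library(ggplot2)
library(reshape2)

# Same values as the data frame in the question, written more compactly
df <- data.frame(
  Year    = 2006:2025,
  Survey  = c(40.5, 39.0, NA, NA, NA, NA, 29.9, NA, NA, NA, 21.6, rep(NA, 9)),
  WhatIf  = c(rep(NA, 6), 29.9, rep(NA, 12), 14.9),
  AARR    = c(rep(NA, 6), 29.9, rep(NA, 12), 13.0),
  Current = c(rep(NA, 6), 29.9, 27.6, 25.4, 23.4, 21.6,
              19.9, 18.4, 16.9, 15.6, 14.4, 13.3, NA, 12.2, 11.3)
)

# Long format: one row per Year/variable pair
df_long <- melt(df, id.vars = "Year")

# Keep only observed values so each series is joined across its gaps
# (df_obs is a hypothetical name used only in this sketch)
df_obs <- df_long[!is.na(df_long$value), ]

p <- ggplot(df_obs, aes(x = Year, y = value, colour = variable)) +
  geom_line(linetype = "dashed") +
  geom_point(shape = 21, size = 3, fill = "white")
p
```

Note that this draws Survey as a dashed line too; if it should look different from the other series, linetype could be mapped to variable and set with scale_linetype_manual().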

Secondary Y axis: Adding a secondary y axis is performed here with the sec.axis= argument within relevant scale_*_ functions. You call sec_axis() function and can specify the various arguments for breaks=, labels=, and name=. It's important to note that the primary axis title can be made through ylab() or labs(y=), whereas the secondary axis title should be specified through scale_*_(sec.axis=sec_axis(name=.... Finally, since the secondary axis mirrors that of the primary axis... the trans= argument in sec_axis() is just "identity". It's very strange to have a secondary axis that is identical to the primary, but I will not question further the layout of the data here.
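To isolate the secondary-axis part, here is a small sketch on made-up data (the toy data frame is hypothetical and only there to make the example self-contained). It passes the identity transformation to sec_axis() as the formula ~ . in the first position, which, as far as I know, avoids depending on the argument name: older ggplot2 releases call it trans, while 3.5.0 and later call it transform.

```r
library(ggplot2)

# Hypothetical toy data, only to illustrate the axis setup
toy <- data.frame(
  Year  = 2012:2016,
  value = c(29.9, 27.6, 25.4, 23.4, 21.6)
)

p <- ggplot(toy, aes(x = Year, y = value)) +
  geom_line(linetype = "dashed") +
  scale_y_continuous(
    limits = c(0, 70),
    breaks = seq(0, 70, 10),
    # `~ .` is the identity transformation, so the right-hand axis mirrors the left
    sec.axis = sec_axis(~ ., name = "Stunting (%)", breaks = seq(0, 70, 10))
  ) +
  labs(x = "Year", y = "Stunting (%)")
p
```

When the secondary axis is an exact copy of the primary one, sec.axis = dup_axis() is a shorter way to get the same result.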

Theme Elements: Most of this stuff should be self-explanatory by looking at the code. Note moving the legend to the bottom while retaining the direction. Also, I used guide_legend() to specify that there should be 2 columns so you get the 4 keys in a 2x2 layout (note... didn't really need legend.direction= there too, but it doesn't hurt).
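The legend layout on its own can be sketched as follows, again on hypothetical toy data with four series (the values are invented for illustration). The two pieces that matter are legend.position in theme() and ncol in guide_legend().

```r
library(ggplot2)

# Hypothetical toy data with four series, only to illustrate the legend layout
toy <- data.frame(
  Year     = rep(2012:2014, times = 4),
  value    = c(30, 28, 26, 30, 27, 24, 30, 26, 22, 30, 25, 20),
  variable = rep(c("Survey", "WhatIf", "AARR", "Current"), each = 3)
)

p <- ggplot(toy, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  theme_classic() +
  # Move the legend below the plot
  theme(legend.position = "bottom") +
  # Arrange the four keys in two columns, giving a 2x2 grid
  guides(colour = guide_legend(ncol = 2)) +
  # Drop the legend title
  labs(colour = NULL)
p
```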

chemdork123
  • I really appreciate your time in explaining the nitty-gritty. But the plot doesn't completely match my expected output: details such as multiple lines for the respective variables with missing values are not there. Of course, your recommendations will help me get started on improving my code. Thank you! – pradeepvaranasi Jul 07 '20 at 13:27
  • It's not clear what you were asking, then, it seems. Can you elaborate or show a picture of what you were looking to do? You can edit your original question to clarify the intended results. If I am interpreting your original question, you were looking to have dotted lines for AARR, WhatIf, and Current, but AARR and WhatIf only contain one point each in the data, so it's not possible to draw a line for each of them (they can only be represented by points). – chemdork123 Jul 09 '20 at 12:52