
I'm just learning R, so there's probably an easier way to do this. I have a table of data showing a set of stores and each store's change in market share versus the same period a year ago. I've included a link to the first two periods' worth of data.

Sample Data

I currently have a scatterplot that looks like this:

[image: current scatterplot]

Each vertical is a four-week period and each store is represented by a point based on their rank (positive or negative) within the gainers and decliners. This is close to what I'm looking for, but the spacing is all off and the datapoints blend into each other. I am trying to build something that looks more like this:

Example

Basically, I want something that looks more like a dotplot but has counts above and below the line. A scatterplot doesn't seem to be the way to go, but I can't see how to make a dotplot that shows my winners above the line and my losers below it, so that the zero line stays consistent across periods. Here's the code I'm using for the scatterplot:

sp1 <- ggplot(store_change_ranked, aes(x=date, y=rank)) +
        geom_point(aes(color = cut(share_chg_yag, c(-Inf, -.1, -.05, -.025, -.015, 0, .015, .025, .05, .1, Inf)))) +
        scale_color_manual(name = "Share Change",
                           values = c("(-Inf,-0.1]" = "red4",
                                      "(-0.1,-0.05]" = "red",
                                      "(-0.05,-0.025]" = "orangered",
                                      "(-0.025,-0.015]" = "darkorange2",
                                      "(-0.015,0]" = "darkorange",
                                      "(0,0.015]" = "greenyellow",
                                      "(0.015,0.025]" = "lightgreen",
                                      "(0.025,0.05]" = "green",
                                      "(0.05,0.1]" = "green2",
                                      "(0.1, Inf]" = "green4"),
                           labels = c("< -10%", " ", "-2.5% to -5.0% ", " ", "0 to -1.5%", "0 to 1.5%", " ", "2.5% to 5.0% ", " ", "10% +")) +
        labs(x = "4-Week Period", title = "Count of Stores Gaining/Losing Share",
             subtitle = "For the 13 periods ending June 2018", y = "# Stores")+
        scale_x_date(date_breaks = "1 month", date_labels = "%m-%y")+
        theme(legend.position = "right", axis.text.y = element_blank(),panel.background=element_blank(),
              panel.grid.major=element_blank(),
              panel.grid.minor=element_blank())

Any help would be appreciated.

Thanks!

Brad
  • [is everyone working on the same task here???](https://stackoverflow.com/questions/51169720/add-annotation-of-count-of-points-greater-or-lower-than-0-in-geom-point-plot.) (that's a link) – tjebo Jul 06 '18 at 00:28
  • also, your plot is not reproducible with the data you are giving. Have a look at [this one too please](https://stackoverflow.com/help/mcve) – tjebo Jul 06 '18 at 00:39
  • and probably, your first question is solved just using geom_jitter instead of geom_point – tjebo Jul 06 '18 at 00:42
  • There's a lot more code here than there needs to be for this issue, and not really enough data. Try to pare down the data to give just the essentials but of more observations, and clear out the stuff like scales and theme elements that aren't needed specifically to build this plot in its essence. – camille Jul 06 '18 at 00:43
  • Thanks for the feedback. That other post you linked doesn't seem to be trying to solve the same issue I am, but they definitely are similar. I gave an example of what I'm trying to get the visualization to look like so hopefully that helps. – Brad Jul 06 '18 at 14:47
  • I think that the positive about using `geom_dotplot` is that you can set the bin width to make a single dot per integer on the y axis. Note you'd need to change the binning axis to the y axis (`binaxis = "y"`) and set `binwidth = 1`. At that point I see the main issue left is fooling around with the size of the plot when you save it to make the points visible. (Note that for the dotplot you need both `color` and `fill`.) – aosmith Jul 06 '18 at 16:39
  • (As an aside, it's nice you made your data available but I will admit that, speaking for myself, I am *much* more likely to be willing to start helping on a problem like this if you've provided the code to get the data into R. For example, rather than providing the link you could provide code like `dat = read.csv("https://pastebin.com/raw/1uytxAVm")`. Think of it as a way to reduce any barriers for your would-be helpers!) – aosmith Jul 06 '18 at 16:48
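
The `geom_dotplot` suggestion from the comments can be sketched roughly as follows. This is only an illustration under assumptions: the `toy` data frame, its `period`, `rank` and `direction` columns, and the two colors are made up here and are not the asker's data. It bins along the y axis with a bin width of 1, so each signed integer rank gets its own dot, and it maps both `color` and `fill` as the comment notes.

```r
library(ggplot2)

# Hypothetical stand-in data: two periods, signed integer ranks
# (positive = gaining share, negative = losing share)
toy <- data.frame(
  period = factor(rep(c("P1", "P2"), each = 5)),
  rank   = c(-2, -1, 1, 2, 3,   -3, -2, -1, 1, 2)
)
toy$direction <- ifelse(toy$rank > 0, "gaining", "losing")

p_dot <- ggplot(toy, aes(x = period, y = rank,
                         color = direction, fill = direction)) +
  geom_hline(yintercept = 0) +               # shared zero line
  geom_dotplot(binaxis = "y",                # bin along y, not x
               binwidth = 1,                 # one bin per integer rank
               stackdir = "center") +        # center dots on each period
  scale_color_manual(values = c(gaining = "green4", losing = "red4")) +
  scale_fill_manual(values = c(gaining = "green4", losing = "red4"))

p_dot
```

Dot size in a dotplot depends on the bin width and the dimensions of the saved plot, so the output height will probably need some adjusting before the points look right.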

1 Answer

As others have pointed out in the comments, your plot is not reproducible, so I can't pinpoint exactly what your problem is. Still, I think that if you follow what I did, you'll be able to plot your data the way you want.

I've simulated some data for my plot, so it won't look exactly like the one from the second picture, but it gives the same idea. Also, since I don't know what the black and gray points are I skipped them.

This is the plot I came up with:

[image: plot of the simulated data]

And this is the code for it:

# **************************************************************************** #
# Simulate Data                                                             ---- 
# **************************************************************************** #

library(ggplot2)  # needed for the plotting code further down

set.seed(123)

create_data <- function(year, month, sector.rising, 
                        max.percent, max.number.sectors) {

  reps <- sum(max.number.sectors, 1)

  if(sector.rising == 1){
    multiplier <- 1
  } else multiplier <- -1

  tmp <- data.frame(
    Year.Month = factor(rep(paste0(year,",", month),reps)),
    Sector = rep(sector.rising,reps),
    Sector.Count = multiplier*seq(0, max.number.sectors),
    Percent = multiplier*sort(runif(reps,min =0, max = max.percent))
  )

  return(tmp)
}


df.tmp <- NULL

for (k.sector in 1:2){

  for (i.year in 2006:2016){
    for (j.month in 1:12) {

      if (k.sector == 1) { # 1 for rising, 2 for falling
        ran.percent <- runif(1,0,1)
      } else ran.percent <- runif(1,0,1.25)

      ran.number.sectors <- rbinom(1, 20, 0.5)

      tmp <- create_data(year = i.year,
                         month = j.month,
                         sector.rising = k.sector, 
                         max.percent = ran.percent,
                         max.number.sectors = ran.number.sectors
      )

      df.tmp <- rbind(df.tmp, tmp)

    }
  }

}

# **************************************************************************** #
# Plot                                                                      ---- 
# **************************************************************************** #

p <- ggplot(
      data = df.tmp,
      aes(x=Year.Month,
          y=Sector.Count, 
          color = cut(Percent, breaks = seq(-1.25, 1, .25), include.lowest = TRUE)
      )
    ) + 
    geom_point(
      size=2,
      alpha = 1,
      pch = 19
    ) +
    scale_x_discrete(
      position = "top",
      breaks = c("2007,1","2008,1","2009,1","2010,1","2011,1",
                 "2012,1","2013,1","2014,1","2015,1"
      ),
      labels = c("2007","2008","2009","2010","2011",
                 "2012","2013","2014","2015"
      ),
      name = ""
      ) +
    scale_y_continuous(
      limits = c(-20,20),
      breaks = seq(-20,20,5),
      labels = as.character(seq(-20,20,5)),
      name = "< SECTORS FALLING       SECTORS RISING >",
      expand = c(0,0)
      ) + 
    scale_color_manual(
      values = c("#d53e4f","#f46d43","#fdae61","#fee08b",
                 "#ffffbf","#e6f598","#abdda4","#66c2a5","#3288bd"),
      name = "", 
      drop = FALSE,
      labels = c("     ",
                 "-1%   ",
                 "     ",
                 "     ",
                 "     ",
                 "0%   ",
                 "     ",
                 "     ",
                 ".75% "),
      guide = guide_legend(
        direction = "horizontal",
        keyheight = unit(2, units = "mm"),
        keywidth = unit(2, units = "mm"),
        nrow = 1,
        byrow = TRUE,
        reverse = FALSE,
        label.position = "bottom",
        override.aes=list(shape=15, cex = 7),
        label.hjust = -0.4,
        title.hjust = 0.5
      )
    ) +
    theme(
      text = element_text(size = 10, color = "#4e4d47"),
      panel.background = element_blank(),
      legend.key.size = unit(1,"mm"),
      legend.position = "top",
      axis.title = element_text(size = 8, color = "#4e4d47"),
      legend.text = element_text(size = 6, color = "#4e4d47")
    )

p 
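
To use the same layout with real data instead of simulated data, the signed position of each point has to be derived from the observed changes. Here is a minimal sketch of one way to do that, assuming columns named `date` and `share_chg_yag` as in the question's code. The toy values and the helper `signed_rank` are made up for illustration, and treating a change of exactly 0 as a decline is my assumption.

```r
# Hypothetical data shaped like the question's table
stores <- data.frame(
  date = as.Date(rep(c("2018-05-01", "2018-06-01"), each = 4)),
  share_chg_yag = c(0.03, -0.02, 0.01, -0.06, 0.12, 0.04, -0.01, 0.02)
)

# Rank gainers 1, 2, 3, ... upward and decliners -1, -2, -3, ... downward,
# so the smallest movers sit closest to the zero line
signed_rank <- function(chg) {
  out <- integer(length(chg))
  up <- chg > 0
  out[up]  <- rank(chg[up], ties.method = "first")
  out[!up] <- -rank(-chg[!up], ties.method = "first")
  out
}

# Apply the helper separately within each period
stores$rank <- ave(stores$share_chg_yag, stores$date, FUN = signed_rank)

stores$rank
# 2 -1 1 -2 3 2 -1 1
```

The resulting `rank` column can then go on the y axis in place of `Sector.Count`, with `share_chg_yag` passed to `cut()` for the color.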
ogustavo