I am plotting a combination of points and lines with two sets of labels, one above and one below each point, as you can see below. To make the plot easier to read I put the y axis on a log10 scale. Everything works except that some of the labels fall outside the chart area. I have tried every method suggested in many posts without a satisfactory result. I am looking for either of these solutions:

1- Expand the y axis into negative values so that the labels can be seen. Note that ylim or limits=c(x,y) does not work for a log or sqrt scale if the numbers are negative.

2- Trick geom_text into keeping the labels visible regardless of the y limits. Note that I have tried vjust="inward" and it works OK, but then I have to use geom_text_repel, which moves the labels around and makes them hard to read, so I would still like to place the labels directly above and below the points.

Any help is appreciated!

Here is the code to generate the data frame:

df1_InSAP_Only <- structure(list(Year_Month = c(
    "2016_06", "2016_06", "2016_07", 
    "2016_07", "2016_08", "2016_08", "2016_09", "2016_09", "2016_09", 
    "2016_09", "2016_10", "2016_10", "2016_10", "2016_10", "2016_11", 
    "2016_11", "2016_12", "2016_12", "2017_01", "2017_01", "2017_01", 
    "2017_02", "2017_02", "2017_02", "2017_02", "2017_03", "2017_03", 
    "2017_03", "2017_03", "2017_03", "2017_03", "2017_04", "2017_04", 
    "2017_04", "2017_04", "2017_04", "2017_05", "2017_05", "2017_05", 
    "2017_05", "2017_05", "2017_05", "2017_06", "2017_06", "2017_06", 
    "2017_06", "2017_06", "2017_06", "2017_07", "2017_07"), 
    Business = c("A", 
     "E", "A", "B", "B", "E", "F", "A", "H", "B", "A", "D", "B", "E", 
     "B", "E", "F", "B", "F", "B", "E", "A", "B", "C", "E", "F", "A", 
     "G", "D", "B", "E", "F", "A", "G", "B", "E", "F", "A", "D", "B", 
     "C", "E", "F", "A", "D", "B", "C", "E", "F", "A"),
    `MMR Count` = c(2L, 
      1L, 1L, 7L, 2L, 1L, 1L, 3L, 1L, 5L, 1L, 1L, 4L, 1L, 8L, 4L, 1L, 
      4L, 2L, 2L, 2L, 3L, 8L, 1L, 2L, 1L, 7L, 1L, 4L, 9L, 2L, 4L, 10L, 
      2L, 15L, 7L, 4L, 27L, 2L, 14L, 1L, 6L, 9L, 31L, 5L, 14L, 1L, 
      4L, 5L, 21L),
     `Duration Average` = c(37, 20, 9, 8, 2, 5, 1, 1, 
      1, 14, 1, 19, 8, 1, 21, 77, 1, 18, 8, 1, 1, 194, 9, 14, 19, 1, 
      10, 1, 6, 9, 18, 4, 12, 170, 7, 35, 9, 10, 7, 12, 3, 15, 5, 9, 
      10, 10, 18, 11, 16, 14)), .Names = c("Year_Month", "Business", 
      "MMR Count", "Duration Average"), row.names = c(NA, 50L), class = "data.frame")

Here is the code that generates the plot:

library(ggplot2)
library(dplyr)  # provides %>% used in the geom_text() data arguments below
ggplot(df1_InSAP_Only,
        aes(x=Year_Month,
            y=`Duration Average`,
            group=Business,
            color=Business,
            size=`MMR Count`)) +
  geom_line(aes(group=Business),stat="identity", size=1, alpha=0.7) +
  geom_point(aes(colour=Business, alpha=0.7)) +
  facet_wrap(~ Business, ncol=2) +
  scale_y_log10( limits=c(-100,1000),breaks=c(0,1,10,100,1000)) +   
  scale_alpha_continuous(range = c(0.5,1), guide='none') + #remove the legend for alpha
  geom_text(data=. %>% dplyr::group_by(Business),
            aes(label=`Duration Average`,vjust=-2),
            size=3,
            position = position_dodge(width=0.9)) +
  geom_text(data=. %>% dplyr::group_by(Business),
            aes(label=`MMR Count`,vjust=3),
            size=3,
            position = position_dodge(width=0.9),
            color="brown")

and here is the plot:

[Plot: the faceted chart produced by the code above]

pogibas
Ibo
  • *"Unfortunately, I cannot give you a reproducible dataset because the whole data is pulled from a database and undergo many calculations"* We don't need your real data - we just need *something* with similar structure that illustrates the problem - something to demonstrate a solution on. 10-20 rows is plenty. Popular choices are built-in data and simulated data. – Gregor Thomas Jul 13 '18 at 17:55
  • You can add data using the `dput` function – pogibas Jul 13 '18 at 17:56
  • As a side-note, you should never use `data$column` inside `aes()`. Change `group=df1_InSAP_Only$Business` to `group = Business`. And I can't imagine that `dplyr::group_by(Business)` is doing anything useful inside the plot layer. – Gregor Thomas Jul 13 '18 at 17:58
  • @Gregor `you should never use data$column inside aes()`, why? – Ibo Jul 13 '18 at 18:03
  • it's not how ggplot2 was created to work (not the most optimal way to use it), please go through `ggplot2` tutorials for that – pogibas Jul 13 '18 at 18:04
  • I added a reproducible example now – Ibo Jul 13 '18 at 18:17
  • [R - ggplot2 - difference between ggplot(data, aes(x=variable…)) and ggplot(data, aes(x=data$variable…))](https://stackoverflow.com/a/51194689/903061) – Gregor Thomas Jul 13 '18 at 18:24

1 Answer

You can't make a log-scaled y axis go negative: the log of a negative number is undefined (and log10(0) is -Inf, so 0 cannot be a limit or a break either). Just move the lower limit closer to 0. Here's your graph with

scale_y_log10(limits=c(.1, 1000),breaks=c(1, 10, 100, 1000))

[Plot: the same graph with the lower y limit set to 0.1]

If you want more room (how much you need depends on the size of the final plot, the size of the text, and the amount of your vjust), lower the limit to 0.05, or 0.01...
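To get a feel for how much space each lower limit buys, here is a minimal base-R sketch. The variable names `lower` and `decades_below_1` are only illustrative. It shows why 0 and negative numbers cannot be used on a log10 axis, and how many decades of empty space each candidate limit adds below y = 1.

```r
# Limits on a log10 scale are passed through log10(), so they must be > 0.
log10(0)                         # -Inf: zero has no finite position on a log axis
suppressWarnings(log10(-100))    # NaN: the log of a negative number is undefined

# Candidate lower limits and the space (in decades) each leaves below y = 1
lower <- c(1, 0.1, 0.05, 0.01)
decades_below_1 <- -log10(lower)
data.frame(lower, decades_below_1)
```

Each factor of 10 in the lower limit adds one full decade of room under the smallest data value, so 0.1 is often enough for a single row of labels.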

I'd also highly recommend using a Date format for your x-axis data. Look how much nicer the axis labels are, and how much cleaner the plot looks with fewer vertical gridlines.
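`as.Date()` needs a complete date, so a day has to be pasted onto each `Year_Month` string before parsing. A minimal check with three of the values from the question (the names `ym` and `dates` are only illustrative):

```r
# Three Year_Month strings taken from the question's data
ym <- c("2016_06", "2017_01", "2017_07")

# Append "_01" so every string is a full year_month_day, then parse it
dates <- as.Date(paste0(ym, "_01"), format = "%Y_%m_%d")

dates          # "2016-06-01" "2017-01-01" "2017-07-01"
class(dates)   # "Date"
```

Because the result is a real `Date` vector, ggplot2 can treat the x axis as continuous time and choose its own breaks instead of drawing one tick per string.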

df1_InSAP_Only$date = as.Date(paste0(df1_InSAP_Only$Year_Month, "_01"), format = "%Y_%m_%d")

 # use date column on x-axis
 # reduce vjust amounts
 # get rid of meaningless group_by() statements
 # get rid of unused position dodges
ggplot(df1_InSAP_Only,
        aes(x=date,
            y=`Duration Average`,
            group=Business,
            color=Business,
            size=`MMR Count`)) +
  geom_line(aes(group=Business),stat="identity", size=1, alpha=0.7) +
  geom_point(aes(colour=Business, alpha=0.7)) +
  facet_wrap(~ Business, ncol=2) +
  scale_y_log10( limits=c(.1,1000),breaks=c(1,10,100,1000)) +   
  scale_alpha_continuous(range = c(0.5,1), guide='none') + #remove the legend for alpha
  geom_text(aes(label=`Duration Average`,vjust=-1),
            size=3) +
  geom_text(aes(label=`MMR Count`,vjust=2),
            size=3,
            color="brown")

[Plot: the same graph with a Date x axis]
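For the second option in the question (keeping labels visible regardless of the y limits), one possible alternative is to turn off panel clipping. This is only a sketch, not part of the approach above: it assumes ggplot2 3.0.0 or later, where `coord_cartesian()` has a `clip` argument, and it uses a small made-up data frame `toy` rather than the real data.

```r
library(ggplot2)

# Hypothetical toy data, not the asker's data frame
toy <- data.frame(
  month    = as.Date(c("2017-01-01", "2017-02-01", "2017-03-01", "2017-04-01")),
  duration = c(1, 14, 194, 9),
  count    = c(2L, 1L, 3L, 8L)
)

p <- ggplot(toy, aes(x = month, y = duration)) +
  geom_line() +
  geom_point() +
  geom_text(aes(label = duration), vjust = -1, size = 3) +
  geom_text(aes(label = count), vjust = 2, size = 3, color = "brown") +
  scale_y_log10(limits = c(1, 1000), breaks = c(1, 10, 100, 1000)) +
  coord_cartesian(clip = "off") +            # let text draw outside the panel
  theme(plot.margin = margin(10, 10, 20, 10)) # extra room at the bottom, in pt
p
```

With clipping off, labels that spill past the panel edge are still drawn, but they can collide with axis text or, in a faceted plot, with neighbouring panels and strips. Lowering the limit as shown above is usually the tidier fix.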

Gregor Thomas
  • I am trying to add the mean of data as a horizontal line, but I could not get around it, I posted a new question for it: https://stackoverflow.com/questions/51332140/how-to-add-the-mean-of-data-as-a-horizontal-line-on-a-faceted-plot-in-r-ggplot2 – Ibo Jul 13 '18 at 20:05