
I am trying to shade part of the coordinate system while keeping the major gridlines visible (as in Img 1). For lack of a better option, I use geom_rect. The code in Snippet 1 works with a test dataset.

When I try the same code with another data.frame, which I built using a long program and some APIs (see below for a description and a reduced version of it), I don't get the expected result: the shade is darker and, more importantly, it covers the gridlines.

Snippet 1:

library(ggplot2)

test <- data.frame(a = c('1', '1', '2', '2'), b = c(2, -2, 4, -3), d = c('m', 'n', 'm', 'n'))
ggplot(data = test, aes(x = a, y = b)) +
  geom_rect(fill = 'grey', xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0, alpha = 0.05) +
  geom_boxplot()

Img 1 - good:

[image: boxplot of test with light shading below y = 0; gridlines visible]

Snippet 2:

ggplot(data = technicalsHt, aes(x = name, y = px_last)) +
  geom_rect(fill = 'grey', xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0, alpha = 0.05) +
  geom_boxplot(outlier.shape = NA)

Img2 - bad:

[image: boxplot of technicalsHt; the shaded area is darker and covers the gridlines]

How can this be solved?

Comparison of the data sets:

> str(test)
'data.frame':   4 obs. of  3 variables:
 $ a: Factor w/ 2 levels "1","2": 1 1 2 2
 $ b: num  2 -2 4 -3
 $ d: Factor w/ 2 levels "m","n": 1 2 1 2

> str(technicalsHt)
'data.frame':   36 obs. of  3 variables:
 $ date   : Date, format: "2017-05-08" "2017-05-09" ...
 $ px_last: num  0.827 0.943 0.652 -0.242 -0.475 ...
 $ name   : Factor w/ 4 levels "Stock Price Strength",..: 1 1 1 1 1 1 1 1 1 2 ...

> technicalsHt
         date     px_last                 name
1  2017-05-08  0.82662887 Stock Price Strength
2  2017-05-09  0.94317706 Stock Price Strength
3  2017-05-10  0.65180657 Stock Price Strength
4  2017-05-11 -0.24172959 Stock Price Strength
5  2017-05-12 -0.47482598 Stock Price Strength
6  2017-05-15  0.67123127 Stock Price Strength
7  2017-05-16  0.71008067 Stock Price Strength
8  2017-05-17 -1.56260914 Stock Price Strength
9  2017-05-18 -1.52375974 Stock Price Strength
10 2017-05-08  0.45763568    Junk Bond Demand*
11 2017-05-09 -0.22417964    Junk Bond Demand*
12 2017-05-10 -0.86425117    Junk Bond Demand*
13 2017-05-11 -0.87816577    Junk Bond Demand*
14 2017-05-12 -0.14069205    Junk Bond Demand*
15 2017-05-15 -0.89208036    Junk Bond Demand*
16 2017-05-16 -0.61378840    Junk Bond Demand*
17 2017-05-17  1.41774297    Junk Bond Demand*
18 2017-05-18  1.73777873    Junk Bond Demand*
19 2017-05-08  1.25714740  Stock Price Breadth
20 2017-05-09  0.86192921  Stock Price Breadth
21 2017-05-10  0.81957857  Stock Price Breadth
22 2017-05-11  0.42779421  Stock Price Breadth
23 2017-05-12 -0.12824197  Stock Price Breadth
24 2017-05-15 -0.06365315  Stock Price Breadth
25 2017-05-16 -0.19438420  Stock Price Breadth
26 2017-05-17 -1.08824445  Stock Price Breadth
27 2017-05-18 -1.89192563  Stock Price Breadth
28 2017-05-08  0.85639356        120D Momentum
29 2017-05-09  0.63138711        120D Momentum
30 2017-05-10  0.67208965        120D Momentum
31 2017-05-11  0.31738619        120D Momentum
32 2017-05-12  0.05165838        120D Momentum
33 2017-05-15  0.52908486        120D Momentum
34 2017-05-16  0.35874200        120D Momentum
35 2017-05-17 -1.89159826        120D Momentum
36 2017-05-18 -1.52514351        120D Momentum

> head(technicalsHt)
        date   px_last          name
1 2016-11-14 -2.278607 120D Momentum
2 2016-11-15 -1.754333 120D Momentum
3 2016-11-16 -1.893738 120D Momentum
4 2016-11-17 -1.574128 120D Momentum
5 2016-11-18 -1.774994 120D Momentum
6 2016-11-21 -1.249234 120D Momentum

> head(test)
  a  b d
1 1  2 m
2 1 -2 n
3 2  4 m
4 2 -3 n

EDIT #1, following @beetroot's answer: The number of rows in my data set seems to make a difference: the more rows, the darker the shade. But the question remains: how can I get shading as in the first image with my data set?

EDIT #2, following @beetroot's answer: Beetroot found a solution for the part of the issue linked to the number of rows. Unfortunately, plugging my data set into beetroot's code produces the following warnings, and all 36 rows are removed:

ggplot() + 
  geom_rect(aes(xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0), alpha = 0.5, fill = "grey") +
  geom_boxplot(data = technicalsHt, aes(x = as.numeric(as.character(name)), y = px_last, group = name)) +
  scale_x_continuous(breaks = c(1,2,3,4))
Warning messages:
1: In eval(expr, envir, enclos) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
4: In min(diff(sort(x))) : no non-missing arguments to min; returning Inf
5: Removed 36 rows containing non-finite values (stat_boxplot). 

It must be that technicalsHt differs in some respect from test?


hartmut
  • The other option (other than not setting the dataset globally) is to use `annotate`. See the answer [here](http://stackoverflow.com/questions/17521438/geom-rect-and-alpha-does-this-work-with-hard-coded-values) – aosmith May 19 '17 at 13:09
  • Possible duplicate of [geom_rect and alpha - does this work with hard coded values?](http://stackoverflow.com/questions/17521438/geom-rect-and-alpha-does-this-work-with-hard-coded-values) – aosmith May 19 '17 at 13:09
  • In terms of the warning message, which I don't think is related to the rectangles, if you try to turn a non-number-based character vector into a number you are going to get all `NA`. This is what `as.numeric(as.character(name))` does in your code. You can get integers back using `as.numeric(as.factor(name))`. – aosmith May 19 '17 at 13:29
  • @aosmith `as.numeric(as.factor())` did it! – hartmut Jun 19 '17 at 16:38
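
The coercion point made in the comments can be checked without any plotting. The sketch below uses a hypothetical factor, `indicator`, as a stand-in for technicalsHt$name (only three of its labels, with the level order fixed by hand); it is an illustration, not the asker's actual data.

```r
# Hypothetical stand-in for technicalsHt$name: a factor with text labels
labels <- c("Stock Price Strength", "Junk Bond Demand*", "120D Momentum")
indicator <- factor(labels, levels = labels)

# Text labels cannot be parsed as numbers, so every element becomes NA
# (R also emits the "NAs introduced by coercion" warning, silenced here)
as_text <- suppressWarnings(as.numeric(as.character(indicator)))

# Converting the factor itself returns its integer level codes instead
as_codes <- as.numeric(indicator)

print(as_text)
print(as_codes)
```

With the test data the first route happened to work because the labels of `a` are '1' and '2', which do parse as numbers.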

1 Answer


The reason is that one rectangle is drawn for each row of the data (I believe). Since your second data frame has more rows, many more semi-transparent rectangles are stacked on top of each other, and the area gets darker.

For instance, look at this plot:

test2 <- rbind(test, test, test)
ggplot(data = test2, aes(x = a, y = b)) + 
  geom_rect(fill = 'grey', xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0, alpha = 0.05) + 
  geom_boxplot()

[image: plot for test2]
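
To get a feel for how quickly the stacking adds up, here is a rough sketch. It assumes standard "over" alpha compositing, where n identical layers of opacity alpha combine to 1 - (1 - alpha)^n; the helper name `stacked_alpha` is made up for this illustration, and the actual rendering may differ slightly by graphics device.

```r
# Combined opacity of n identical layers, each drawn with opacity alpha
# (assumes standard "over" compositing)
stacked_alpha <- function(alpha, n) 1 - (1 - alpha)^n

# Row counts of the three data frames discussed in this thread
rows <- c(test = 4, test2 = 12, technicalsHt = 36)

print(round(stacked_alpha(0.05, rows), 2))
```

Under that assumption, alpha = 0.05 gives an effective opacity of roughly 0.19 for the 4 rows of test, 0.46 for test2, and 0.84 for the 36 rows of technicalsHt, which would explain why the gridlines disappear in the second plot.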

You can avoid this by moving the data argument from ggplot() to geom_boxplot() and putting the rectangle's limits inside aes(), so that only one rectangle is drawn. However, mapping xmin and xmax in aes() makes the x scale continuous, so I had to convert a to numeric, which might not be ideal:

ggplot() + 
  geom_rect(aes(xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0), alpha = 0.5, fill = "grey") +
  geom_boxplot(data = test, aes(x = as.numeric(as.character(a)), y = b, group = a)) +
  scale_x_continuous(breaks = c(1,2))

Plot for test (increased alpha to 0.5 for visibility):

[image]

Plot for test2 (increased alpha to 0.5 for visibility):

[image]
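
If converting the x variable to numeric is undesirable, one possible variation (a sketch, not tested against technicalsHt) is to keep the layout of Snippet 1 but hand geom_rect() a single row of the data, so that only one rectangle is drawn and the x scale stays discrete. The name `shade_row` is made up for this example, and the data are the test data from the question.

```r
library(ggplot2)

test <- data.frame(a = c('1', '1', '2', '2'), b = c(2, -2, 4, -3))

# Any single row will do: it only controls how many rectangles are drawn
shade_row <- test[1, ]

p <- ggplot(data = test, aes(x = a, y = b)) +
  # Limits are set as constants outside aes(), as in Snippet 1,
  # so this layer does not force a continuous x scale
  geom_rect(data = shade_row, fill = 'grey', alpha = 0.5,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = 0) +
  geom_boxplot()

# The rectangle layer should now contain exactly one row
print(nrow(layer_data(p, 1)))
```

Because only one rectangle is drawn, the alpha value here is the opacity that actually appears on screen, regardless of how many rows the boxplot data have.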

erc
  • Thank you so much for this answer. I think you have a point when saying the number of rows in the data set is different. Thanks. Putting the data in the geom_boxplot as you do leads to no shading at all though (see the second image you uploaded)!! – hartmut May 19 '17 at 10:21
  • Thank you very much once more. Your solution works with the test dataset... but not with the one I'm initially working with. I edited my question correspondingly, and I uploaded a reduced version of the data I am using. For sure we are progressing, but part of the story seems to still be missing. Thanks – hartmut May 19 '17 at 13:06