Divide the data frame into two parts, and find the min and max respectively in R

Question

For a dummy dataset df,

df <- structure(list(date = c("2021-07-31", "2021-08-31", "2021-09-30", 
"2021-10-31", "2021-11-30", "2021-12-31", "2022-01-31", "2022-02-28", 
"2022-03-31", "2022-04-30", "2022-05-31"), PMI = c(52.4, 48.9, 
51.7, 50.8, 52.2, 52.2, 51, 51.2, 48.8, 62.7, 48.4), Exchange_rate = c(5.1, 
5.1, 4.9, 4.9, 5, 5.1, 5.3, 5.5, 5.8, 6.1, 5.9), BCI = c(54.6, 
50, 54.5, 51.6, 49.2, 45.1, 52.6, 53.8, 51.3, 40.8, 37.3)), class = "data.frame", row.names = c(NA, 
-11L))

Out:

         date  PMI Exchange_rate  BCI
1  2021-07-31 52.4           5.1 54.6
2  2021-08-31 48.9           5.1 50.0
3  2021-09-30 51.7           4.9 54.5
4  2021-10-31 50.8           4.9 51.6
5  2021-11-30 52.2           5.0 49.2
6  2021-12-31 52.2           5.1 45.1
7  2022-01-31 51.0           5.3 52.6
8  2022-02-28 51.2           5.5 53.8
9  2022-03-31 48.8           5.8 51.3
10 2022-04-30 62.7           6.1 40.8
11 2022-05-31 48.4           5.9 37.3

I'm trying to draw a time series plot with dual y axes, with Exchange_rate on the left axis and PMI and BCI on the right axis. To do this, I need the min and max of Exchange_rate, and the min and max over all values of PMI and BCI combined. I also want to subtract a suitable margin from each minimum and add one to each maximum, so that all values fit inside the final plot.

Using print(skimr::skim(df)), I print out:

-- Variable type: numeric -----------------------------------------------------------------------------------------------
  skim_variable n_missing complete_rate  mean    sd   p0   p25  p50   p75 p100 hist 
1 PMI                   0             1 51.8  3.88  48.4 49.8  51.2 52.2  62.7 ▇▅▁▁▁
2 Exchange_rate         0             1  5.34 0.425  4.9  5.05  5.1  5.65  6.1 ▇▁▁▁▂
3 BCI                   0             1 49.2  5.74  37.3 47.2  51.3 53.2  54.6 ▁▁▁▂▇

As you can see, p0 and p100 in the result are the minimum and maximum values of each column, respectively. For the left axis, I need approximately c(4.5, 6.5) as the lower and upper limits, and for the right axis, I need approximately c(37, 63).

My expected results are as follows (they do not need to match the values below exactly):

left_y_axis_limit <- c(4.5, 6.5)
right_y_axis_limit <- c(37, 63)

Suppose we later have other data with a different range of values. Given the column names to be displayed on the left and right axes, how can these limits be found adaptively? Thanks.

What kind of plot are you looking for? Can you show some early code? Also, check [here](https://stackoverflow.com/questions/3099219/ggplot-with-2-y-axes-on-each-side-and-different-scales) for secondary axes in ggplot2 — Maël, Sep 09 '22 at 09:55
Also, check `range`: `range(df$Exchange_rate); range(c(df$PMI, df$BCI))` — Maël, Sep 09 '22 at 09:57
I use the solution provided by @teunbrand from this link: https://stackoverflow.com/questions/73220807/set-left-and-right-limit-ranges-simultaneously-for-dual-y-axis-plot-using-ggplot — ah bon, Sep 09 '22 at 09:59
Now I hope to find the ranges for `primary` and `secondary` in `sec <- ggh4x::help_secondary(name = "", primary = c(-1, 9), secondary = c(-2, 20),)` in an adaptive way, instead of setting them manually. — ah bon, Sep 09 '22 at 10:01
In this case, it will be `sec <- ggh4x::help_secondary(name = "", primary = c(4.5, 6.5), secondary = c(37, 63))`. — ah bon, Sep 09 '22 at 10:02
`range(df$Exchange_rate)` just returns the maximum and minimum values, right? But I need the new range to be slightly smaller than the original minimum and slightly larger than the maximum. — ah bon, Sep 09 '22 at 10:04

G. Grothendieck · Accepted Answer · 2022-09-09T12:26:09.027

1) You don't actually need that calculation if all you need is dual axes. The question did not specify the plot so we will assume classic graphics. Convert df to a zoo object and then use plot.zoo, first plotting the Exchange_rate and then overlaying that with a PMI/BCI plot. There is a further example in the Examples section of ?plot.zoo. You may need to adjust the 0.15 according to how far from the axis you want the second y axis label to be.

(continued after graphics)

library(zoo)

z <- read.zoo(df)

opar <- par(mai = c(.8, .8, .2, .8))

with(z, plot(Exchange_rate, type = "l", xlab = ""))

par(new = TRUE)
plot(z[, c("PMI", "BCI")], screens = 1, ann = FALSE, yaxt = "n", col = "blue",
  lty = 1:2)
axis(side = 4, col = "blue")
usr <- par("usr")
text(usr[2] + .15 * diff(usr[1:2]), mean(usr[3:4]), "PMI/BCI",
  srt = -90, xpd = TRUE, col = "blue")

legend(x = "topleft", bty = "n", lty = c(1, 1:2), col = c("black", "blue", "blue"),
  legend = c("Exchange Rate", "PMI", "BCI"))

par(opar)

2) With ggplot2 one can use sec.axis= but it requires that you calculate your own transformation, whereas in classic graphics one can use par("usr") to get the key data. How to do it with ggplot2 is described in https://finchstudio.io/blog/ggplot-dual-y-axes/ and in [ggplot with 2 y axes on each side and different scales](https://stackoverflow.com/questions/3099219/ggplot-with-2-y-axes-on-each-side-and-different-scales). Note that one of the solutions in the last link provides a train_sec function which automates the calculation of the transformation, and another answer shows how to do it with cowplot, eliminating the need for sec.axis= entirely.
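The transformation that sec.axis= needs is typically a linear map between the two axis ranges. The following is only a base R sketch of that idea, not code from the linked posts: `rescale_to` is a hypothetical helper name, and the limits are the ones given in the question.

```r
# Hypothetical helper (an assumption, not part of ggplot2 or ggh4x):
# linearly map values x from the range `from` onto the range `to`.
rescale_to <- function(x, from, to) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

primary   <- c(4.5, 6.5)  # left axis limits (Exchange_rate), from the question
secondary <- c(37, 63)    # right axis limits (PMI and BCI), from the question

# A few PMI values from the question's data
pmi <- c(52.4, 48.9, 51.7, 62.7, 48.4)

# Forward map: the right-axis series expressed in left-axis units,
# which is what would actually be drawn on the shared panel.
pmi_on_primary <- rescale_to(pmi, from = secondary, to = primary)

# Inverse map: left-axis positions converted back to right-axis values,
# which is what the secondary axis labels would have to show.
pmi_back <- rescale_to(pmi_on_primary, from = primary, to = secondary)
```

In ggplot2 one would then draw the rescaled series and hand the inverse map to the secondary axis, presumably along the lines of `sec_axis(~ rescale_to(., primary, secondary))`; treat that call as an untested assumption.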

3) doubleYScale in latticeExtra allows one to have double Y axes without any calculations.

library(latticeExtra)
library(zoo)

z <- read.zoo(df)

key1 <- simpleKey(col = "black", text = "Exchange Rate", 
  lines = TRUE, points = FALSE)
key1$lines$col <- "black"
p1 <- xyplot(z$Exchange_rate, col = "black", key = key1,
  xlab = "", ylab = "Exchange rate", ylab.right = "PMI/BCI")

key2 <- simpleKey(col = "blue", text = c("PMI", "BCI"),
  lines = TRUE, points = FALSE)
key2$lines$col <- "blue"
key2$lines$lty <- 1:2
p2 <- xyplot(z[, c("PMI", "BCI")], screen = 1, col = "blue",
  key = key2, lty = 1:2)

doubleYScale(p1, p2)

Thank you for your very detailed answer. But if possible, I would still like to plot based on the code from this link: https://stackoverflow.com/questions/73220807/set-left-and-right-limit-ranges-simultaneously-for-dual-y-axis-plot-using-ggplot, since this is part of a project based on ggplot2. — ah bon, Sep 09 '22 at 13:51
Yeah, I've tested `sec.axis = sec_axis(~ . * scale)` before and met 2 issues: 1. sometimes it is not easy to find an appropriate `scale`, 2. sometimes I didn't get the expected ranges for the second y axis. So I gave the `ggh4x` package a try. Now I am attempting to automatically find `primary = c(4.5, 6.5), secondary = c(37, 63)` for that dataset. — ah bon, Sep 09 '22 at 14:13
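
For the adaptive limits asked about in these comments, one possible sketch (not taken from the answer above) is to pool the columns assigned to an axis, take their `range()` as suggested in the earlier comment, and round that range outward to a multiple of a chosen step. `padded_limits` and its `step` argument are hypothetical names, and the step sizes 0.5 and 1 are assumptions picked to reproduce the limits in the question.

```r
# Hypothetical helper (not from any package): pool the named columns,
# take their overall range, and round it outward to a multiple of `step`.
padded_limits <- function(data, cols, step = 1) {
  r <- range(unlist(data[cols]), na.rm = TRUE)
  c(floor(r[1] / step), ceiling(r[2] / step)) * step
}

# Numeric columns of the question's df
df <- data.frame(
  PMI = c(52.4, 48.9, 51.7, 50.8, 52.2, 52.2, 51, 51.2, 48.8, 62.7, 48.4),
  Exchange_rate = c(5.1, 5.1, 4.9, 4.9, 5, 5.1, 5.3, 5.5, 5.8, 6.1, 5.9),
  BCI = c(54.6, 50, 54.5, 51.6, 49.2, 45.1, 52.6, 53.8, 51.3, 40.8, 37.3)
)

# The column names per axis are the only inputs that change with new data
left_y_axis_limit  <- padded_limits(df, "Exchange_rate", step = 0.5)
right_y_axis_limit <- padded_limits(df, c("PMI", "BCI"), step = 1)

left_y_axis_limit   # 4.5 6.5
right_y_axis_limit  # 37 63
```

If a minimum or maximum already sits exactly on a multiple of `step`, the limit coincides with the data, so a smaller step or an extra margin may be needed. The two results could then be supplied as `primary` and `secondary` in the `ggh4x::help_secondary()` call quoted in the comments.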
