I am trying to plot the US yield curve, but I would like the x-axis ticks to be closer together up to 1 year and evenly spaced after that. I have not found a way to do this. Does anyone know how?

My data set is:

Interest February August
1 Mo 2.186 0.035
3 Mo 2.626 0.279
6 Mo 3.128 0.590
1 Y 3.290 0.900
2 Y 3.214 1.368
3 Y 3.149 1.631
5 Y 2.923 1.824
10 Y 2.864 1.924
20 Y 2.774 2.323
30 Y 2.999 2.253

I made a graph with the following code:


library(ggplot2)

# Fix the order of the maturities so they are not sorted alphabetically
a$level_order <- factor(a$Interest, c('1 Mo', '3 Mo', '6 Mo','1 Y','2 Y','3 Y', '5 Y','7 Y', '10 Y', '20 Y', '30 Y'))
ggplot(a, aes(x=level_order,y=August,group=1))+
  geom_line(color="darkred")+
  geom_point(color="darkred")+
  geom_point(aes(y=February),color="darkblue")+
  geom_line(aes(y=February),color="darkblue")+
  theme_bw()

I would like to reduce the spacing between 1 month and 1 year.

What I would like to get looks like this (a steeper slope before 1 Y), with 1 Mo, 3 Mo and 6 Mo close together:

What I would like to achieve

divibisan
jx21
  • Do you mean you want the x-axis spaced according to the time elapsed? – Mike Aug 09 '22 at 15:12
  • Maybe the second answer here helps: https://stackoverflow.com/questions/53500003/change-x-axis-names-in-ggplot. Check also https://ggplot2.tidyverse.org/reference/scale_date.html – RobertoT Aug 09 '22 at 15:14
  • Maybe I was not clear enough. I would like 1 Mo, 3 Mo and 6 Mo to be close to each other, while after 1 Y the spacing stays the same, like a log scale – jx21 Aug 09 '22 at 15:21

2 Answers

The problem is that you are using a factor for the x-axis. ggplot treats factors as discrete data and plots their levels at equal distances.
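A minimal base-R sketch of why the spacing comes out equal: a factor stores its levels as consecutive integer codes, and a discrete position scale places the levels at equally spaced positions in that order. The vector name `maturities` is introduced here for illustration only.

```r
# Maturities from the question, in plotting order
maturities <- c("1 Mo", "3 Mo", "6 Mo", "1 Y", "2 Y", "3 Y",
                "5 Y", "10 Y", "20 Y", "30 Y")

# Build the factor with the levels in the same order as the data
level_order <- factor(maturities, levels = maturities)

# Integer codes behind the factor: one step per level,
# regardless of how much time separates the maturities
positions <- as.integer(level_order)
print(positions)

# Every gap between neighbouring levels is the same size
print(diff(positions))
```

The gap between "1 Mo" and "3 Mo" is therefore drawn as wide as the gap between "20 Y" and "30 Y".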

  1. You can use scale_x_discrete() and manually write every break and the label for each break, which is a bit tedious, and you have to adapt it again whenever you add new rows.

  2. You can create a new numeric (continuous) column from your character/factor column Interest, with every value expressed in the same unit, and then use scale_x_continuous().

Example:

library(dplyr)
library(tidyr)    # provides extract()
library(stringr)  # provides str_c()
library(ggplot2)

example = data.frame(
  Interest = c('1 Mo', '2 Mo', '6 Mo', '1 Y', '2 Y', '5 Y'), 
  February = c(2.186,2.626,3.128,3.290,3.214,2.923),
  August = c(0.035,0.279,0.590,0.900,1.368,1.824)
)

example %>% # use extract() to quickly split numbers and letters
  extract(Interest, into=c("Num", "time"), "([0-9]*)(.*)") %>% 
  mutate(Num = as.numeric(Num),
         Year_period = ifelse(time == " Mo", Num/12, Num)) %>% # change months magnitude to years
  
  ggplot(aes(x=Year_period,y=August,group=1))+
  geom_line(color="darkred")+
  geom_point(color="darkred")+
  geom_point(aes(y=February),color="darkblue")+
  geom_line(aes(y=February),color="darkblue")+
  scale_x_continuous(name="Years",breaks = seq(1,5,1), labels = str_c(seq(1,5,1),"Y"))+ # Here you re-scale the x-axis
  theme_bw()

Within scale_x_continuous(), breaks= sets where the ticks go (here a vector of integers, one per year), and labels=str_c(seq(1,5,1),"Y") creates the label for each tick. Because all "Mo" rows are divided by 12, and are therefore < 1, the x-axis now represents years.

Note that I haven't used the whole data set. Adapt breaks and labels to all your rows (if your maximum is '30 Y', then breaks = seq(1,30,1) and labels=str_c(seq(1,30,1),"Y")). Now the first values ('Mo') are closer together and fall below '1Y'.
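Applied to the full data set from the question, the same idea could look like the sketch below. It parses the maturities with base R instead of extract(), and it puts breaks only at the yearly maturities present in the data rather than at every year from 1 to 30. Both choices are assumptions made for this illustration, not part of the original answer.

```r
library(ggplot2)

# Full data set from the question
a <- data.frame(
  Interest = c("1 Mo", "3 Mo", "6 Mo", "1 Y", "2 Y", "3 Y",
               "5 Y", "10 Y", "20 Y", "30 Y"),
  February = c(2.186, 2.626, 3.128, 3.290, 3.214, 3.149,
               2.923, 2.864, 2.774, 2.999),
  August   = c(0.035, 0.279, 0.590, 0.900, 1.368, 1.631,
               1.824, 1.924, 2.323, 2.253)
)

# Numeric part of each maturity, e.g. "10 Y" -> 10
num <- as.numeric(sub(" .*$", "", a$Interest))

# Express every maturity in years: months are divided by 12
a$Year_period <- ifelse(grepl("Mo$", a$Interest), num / 12, num)

# Only the yearly maturities get a tick and a label
yearly <- grepl("Y$", a$Interest)

p <- ggplot(a, aes(x = Year_period)) +
  geom_line(aes(y = August), color = "darkred") +
  geom_point(aes(y = August), color = "darkred") +
  geom_line(aes(y = February), color = "darkblue") +
  geom_point(aes(y = February), color = "darkblue") +
  scale_x_continuous(name = "Years",
                     breaks = a$Year_period[yearly],
                     labels = a$Interest[yearly]) +
  theme_bw()
p
```

On a linear axis running to 30 years, the labels for 1 Y, 2 Y and 3 Y sit close together, so some of them may need to be dropped from breaks for readability.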

RobertoT

There isn't a great way to do this natively, since the tidyverse philosophy is opposed to this kind of false scaling. The "correct" way to do this is to have a consistent axis, either linear or log-scaled.
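For reference, a rough sketch of the log-scaled option, assuming the maturities are first converted to years (the `years` column and the object name `p_log` are introduced here for illustration). Note that a log axis spreads the short maturities apart rather than pulling them together, so it may not match the picture the question asks for.

```r
library(ggplot2)

# Full data set from the question
a <- data.frame(
  Interest = c("1 Mo", "3 Mo", "6 Mo", "1 Y", "2 Y", "3 Y",
               "5 Y", "10 Y", "20 Y", "30 Y"),
  February = c(2.186, 2.626, 3.128, 3.290, 3.214, 3.149,
               2.923, 2.864, 2.774, 2.999),
  August   = c(0.035, 0.279, 0.590, 0.900, 1.368, 1.631,
               1.824, 1.924, 2.323, 2.253)
)

# Express every maturity in years (months divided by 12)
num <- as.numeric(sub(" .*$", "", a$Interest))
a$years <- ifelse(grepl("Mo$", a$Interest), num / 12, num)

# Consistent log10 axis with one labelled tick per maturity
p_log <- ggplot(a, aes(x = years)) +
  geom_line(aes(y = August), color = "darkred") +
  geom_point(aes(y = August), color = "darkred") +
  geom_line(aes(y = February), color = "darkblue") +
  geom_point(aes(y = February), color = "darkblue") +
  scale_x_log10(name = "Maturity (log scale)",
                breaks = a$years,
                labels = a$Interest) +
  theme_bw()
p_log
```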

That being said, while it's a bit hacky, we can achieve what you want by creating a pre-scaled integer scale for the x-axis:

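library(dplyr)

b <- a %>%
    group_by(yr = grepl('Y$', Interest)) %>%  # yr is TRUE for yearly maturities
    mutate(level = row_number(),
           level = if_else(yr, as.integer(3*level+1), level)) %>%
    ungroup()  # drop the grouping once level is computed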

This gives us this table where each position on level is a consistent distance on the x-axis, while yr identifies the values which should have a tick.

   Interest February August yr    level
   <chr>       <dbl>  <dbl> <lgl> <int>
 1 1 Mo         2.19  0.035 FALSE     1
 2 3 Mo         2.63  0.279 FALSE     2
 3 6 Mo         3.13  0.59  FALSE     3
 4 1 Y          3.29  0.9   TRUE      4
 5 2 Y          3.21  1.37  TRUE      7
 6 3 Y          3.15  1.63  TRUE     10
 7 5 Y          2.92  1.82  TRUE     13
 8 10 Y         2.86  1.92  TRUE     16
 9 20 Y         2.77  2.32  TRUE     19
10 30 Y         3.00  2.25  TRUE     22

Now, we can just plot with a continuous x-axis, using the values of level for breaks and Interest for labels, with yr determining whether to show the label or not (you can drop that selector if you want to show the ticks for the monthly intervals):

ggplot(b, aes(x=level,y=August,group=1))+
    geom_line(color="darkred")+
    geom_point(color="darkred")+
    geom_point(aes(y=February),color="darkblue")+
    geom_line(aes(y=February),color="darkblue")+
    scale_x_continuous(breaks=b$level[b$yr],
                       labels=b$Interest[b$yr]) +
    theme_bw()

divibisan
  • The problem with this calculated 'level' is that it spaces the three monthly maturities equally, although the intervals between them are not equal. – RobertoT Aug 09 '22 at 16:18
  • @RobertoT You're right, but OP doesn't want them to be accurately spaced. Your answer is the **correct** one in terms of data visualization, but for whatever reason, OP wants the first 3 close together and then the year ticks evenly spaced. If they wanted a different scaling for the months, they could change how that level factor is set. For example, they could keep the months as 1, 3, 6, then set the year rows to 12*level+1 – divibisan Aug 09 '22 at 16:24
  • Thanks, I see that it works. However, the tuning parameters are more complex than the other solution. Nevertheless, it remains helpful in understanding how R works. – jx21 Aug 09 '22 at 16:28
  • @jx21 I definitely think the other answer is a more accurate visualization, but if you do want your x-axis to be a continuous scale, scaled by time, rather than a discrete scale with each "year bin" evenly spaced, you should update your question, since you specifically ask for the latter – divibisan Aug 09 '22 at 16:31
  • @divibisan You are right, the other answer is more accurate. That's why I selected it. – jx21 Aug 09 '22 at 16:35
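One possible reading of the variant suggested in the comments (months kept at 1, 3, 6 and year rows set to 12*level+1) is sketched below in base R. It assumes that "level" for the year rows means their running count (1 for 1 Y, 2 for 2 Y, 3 for 3 Y, 4 for 5 Y, and so on); that interpretation and the helper names `is_year` and `num` are assumptions, not something the comment spells out.

```r
# Maturities from the question
a <- data.frame(
  Interest = c("1 Mo", "3 Mo", "6 Mo", "1 Y", "2 Y", "3 Y",
               "5 Y", "10 Y", "20 Y", "30 Y")
)

# TRUE for the yearly maturities
is_year <- grepl("Y$", a$Interest)

# Numeric part of each maturity, e.g. "6 Mo" -> 6
num <- as.numeric(sub(" .*$", "", a$Interest))

# Months keep their own number (1, 3, 6);
# year rows are spaced 12 units apart, starting at 13
a$level <- ifelse(is_year, 12 * cumsum(is_year) + 1, num)
print(a)
```

The resulting `level` column could then be used as the x aesthetic with scale_x_continuous(), in the same way as in the answer above.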