I am building multiple forecasts in R and I am trying to select certain columns from the forecast output. Below is what the fable looks like:

> head(forData)
# A fable: 6 x 8 [1M]
# Key:     .model [1]
  .model     Month             ABC .mean DateVar             PCT     Ind1    Ind2
  <chr>      <mth>          <dist> <dbl> <dttm>              <dbl>   <dbl>   <dbl>
1 average 2021 Jul N(0.31, 0.0017) 0.315 2021-07-01 00:00:00  3.25       0       0
2 average 2021 Aug N(0.33, 0.0024) 0.328 2021-08-01 00:00:00  3.25       0       0
3 average 2021 Sep N(0.33, 0.0029) 0.329 2021-09-01 00:00:00  3.25       0       0
4 average 2021 Oct N(0.32, 0.0038) 0.322 2021-10-01 00:00:00  3.25       0       0
5 average 2021 Nov N(0.33, 0.0044) 0.328 2021-11-01 00:00:00  3.25       0       0
6 average 2021 Dec N(0.33, 0.0051) 0.326 2021-12-01 00:00:00  3.25       0       0

When I try to use dplyr to select any columns I get the following error:

> forData %>% select(Month, .mean)
Error: Can't subset columns that don't exist.
x Column `ABC` doesn't exist.

Each of the calls below returns a vector, so I assume the column names are correct, but I don't understand the error:

forData$Month
forData$.mean
user1723699

2 Answers

Convert the fable to a tibble first, then select the columns. The backquotes around .mean are optional, since .mean is already a valid R name:

# Drop the fable class first so the distribution column is no longer required
forData %>%
  as_tibble() %>%
  select(Month, `.mean`)
akrun

The underlying issue (which the error message does not make clear; I'll try to improve it) is that a <fable> must contain a distribution column. Selecting only Month and .mean removes the ABC (distribution) column, which is required. If you no longer want the distribution, you will need to convert to a different data class. There are two main options:

  • a <tsibble> with as_tsibble() (which requires the time column Month that you still have)
  • a <tibble> with as_tibble() (which has no requirements on the columns it contains)
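
The two conversions above can be sketched as follows. This is a minimal, self-contained illustration rather than the asker's actual pipeline: the toy series and the MEAN() model are assumptions made up to produce a fable with the same columns (.model, Month, ABC, .mean) as in the question.

```r
library(fable)    # MEAN() model; attaches fabletools (model(), forecast())
library(tsibble)  # tsibble(), yearmonth(), as_tsibble()
library(dplyr)    # %>%, select(), as_tibble()

# Hypothetical stand-in for the question's data: 24 monthly values of ABC
toy <- tsibble(
  Month = yearmonth("2019 Jul") + 0:23,
  ABC   = 0.3 + 0.01 * sin(0:23),
  index = Month
)

# A fable with one model named "average", like forData in the question
forData <- toy %>%
  model(average = MEAN(ABC)) %>%
  forecast(h = 6)

# Option 1: convert to a tsibble; the time column Month must stay,
# and the key column .model is kept here explicitly
ts_out <- forData %>%
  as_tsibble() %>%
  select(.model, Month, .mean)

# Option 2: convert to a tibble; any subset of columns is allowed
tb_out <- forData %>%
  as_tibble() %>%
  select(Month, .mean)
```

Either result can be passed on to ordinary dplyr code. The tibble route is the simpler choice when the time-series structure is no longer needed; the tsibble route keeps the index, so time-aware functions continue to work.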