
I have time series data with one column for the month and one for the year. The months are coded as JAN, FEB, etc.

I'm trying to combine them into a single month-year variable so that I can run a time series analysis on it. I'm very new to R and could use any guidance.

Becky
  • Welcome to SO, Becky! Questions on SO (especially in R) do much better if they are reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically after `set.seed(1)`), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans May 06 '20 at 23:50
  • If you're talking "time series", then you should consider making them a "proper" `Date` object. Internally, it's naturally numeric, so comparisons (ordinality and gaps) are directly supported. As strings (`"JAN"` or `"2019JAN"` or `"201901"`), they can sort correctly (assuming most-significant first and not alphabetic months), but they are *categorical* and therefore processing them takes a bit more effort. – r2evans May 07 '20 at 00:35
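A minimal base-R sketch of the point in the comment above, using made-up labels (the vector names are illustrative, not from the question): strings sort alphabetically, whereas `Date` values sort chronologically and support arithmetic on gaps.

```r
# Hypothetical month labels, all in 2019 (illustrative only)
labels <- c("2019MAR", "2019JAN", "2019FEB")

# As strings they sort alphabetically, which is not chronological
sorted_labels <- sort(labels)

# Convert to Date: map the month abbreviation to its number via the
# built-in month.abb vector, then anchor each period on the first of the month
yr <- as.integer(substr(labels, 1, 4))
mo <- match(substr(labels, 5, 7), toupper(month.abb))
dates <- as.Date(sprintf("%d-%02d-01", yr, mo))

# Dates sort chronologically, and gaps are plain arithmetic
sorted_dates <- sort(dates)
gaps <- diff(sorted_dates)

print(sorted_labels)
print(sorted_dates)
print(gaps)
```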

3 Answers


Perhaps something like this?

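library(dplyr)  # provides the %>% pipe

# Month labels, repeated once per year (36 values in total)
c("JAN", "FEB", "MAR", "APR",
  "MAY", "JUN", "JUL", "AUG",
  "SEP", "OCT", "NOV", "DEC") %>%
  rep(., times = 3) %>%
  as.factor() -> months

# Each year repeated 12 times so that it lines up with the months
c("2018", "2019", "2020") %>%
  rep(., each = 12) %>%
  as.factor() -> years

# Example data frame with one month column and one year column
df1 <- cbind.data.frame(months, years)

# Combine into labels such as "JAN.2018"
paste(df1$months, df1$years, sep = ".") %>%
  as.factor() -> merged.years.months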
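
One caveat with the factor above: its levels are ordered alphabetically (`APR.2018`, `APR.2019`, ...), not chronologically. If the combined label is kept as a factor, one possible fix is to set the levels explicitly. The sketch below rebuilds the same data in base R; `ord`, `month_levels` and `merged` are illustrative names, not part of the answer.

```r
# Same layout as above: 12 months repeated for 3 years
months <- rep(toupper(month.abb), times = 3)
years  <- rep(c("2018", "2019", "2020"), each = 12)

# Chronological order: sort by year first, then by month position
ord <- order(as.integer(years), match(months, toupper(month.abb)))
month_levels <- unique(paste(months, years, sep = ".")[ord])

# Ordered factor whose levels follow calendar time
merged <- factor(paste(months, years, sep = "."),
                 levels = month_levels, ordered = TRUE)

print(head(levels(merged), 3))
print(merged[1] < merged[13])  # compares JAN.2018 with JAN.2019
```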
Tavaro Evanis

Start with your month/year data frame.

library(tidyverse)
library(lubridate)
events <- tibble(month = c("JAN", "MAR", "FEB", "NOV", "AUG"),
       year = c(2018, 2019, 2018, 2020, 2019))

Let's say that each of your time periods starts on the first of the month.

series <- events %>% 
  mutate(mo1 = dmy(paste(1, month, year)))

This gives you what you want:

R > series
# A tibble: 5 x 3
  month  year mo1       
  <chr> <dbl> <date>    
1 JAN    2018 2018-01-01
2 MAR    2019 2019-03-01
3 FEB    2018 2018-02-01
4 NOV    2020 2020-11-01
5 AUG    2019 2019-08-01

These are now dates; you can use them in other analyses.
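
For example, once the rows are sorted by date they can feed a base-R `ts` object. This is only a sketch: the `value` column is invented for illustration, and `ts` assumes a regular monthly series with no missing months, which the five-row example above does not satisfy, so the sketch uses a complete six-month run instead.

```r
# Hypothetical complete monthly data, deliberately out of order
obs <- data.frame(
  month = c("MAR", "JAN", "FEB", "JUN", "APR", "MAY"),
  year  = 2018,
  value = c(30, 10, 20, 60, 40, 50)  # made-up measurements
)

# First-of-month dates via base R (no extra packages)
obs$mo1 <- as.Date(sprintf("%d-%02d-01", as.integer(obs$year),
                           match(obs$month, toupper(month.abb))))

# Sort chronologically before building the series
obs <- obs[order(obs$mo1), ]

# Monthly series starting at the first observed year and month
first <- as.POSIXlt(obs$mo1[1])
series_ts <- ts(obs$value,
                start = c(first$year + 1900, first$mon + 1),
                frequency = 12)
print(series_ts)
```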

David T

Base R solution:

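# Assumes `events` has `month` ("JAN", "FEB", ...) and `year` columns,
# as in the example above
events <- within(events, {
  # Map "JAN".."DEC" to 1..12 by position in the built-in month.abb vector
  # (sorting and converting to a factor would number the months alphabetically)
  month_no <- match(toupper(month), toupper(month.abb))
  # Build a zero-padded ISO date string for the first of each month and parse it
  date <- as.Date(sprintf("%d-%02d-01", as.integer(as.character(year)), month_no),
                  format = "%Y-%m-%d")
  # Drop the helper and the original columns, keeping only `date`
  rm(month_no, month, year)
})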
hello_friend