I have monthly data on miles traveled from 1970 to 2019, with one entry per month.
There is no missing data in this dataset.
Here is a snippet of the data in CSV format:

DATE,TRFVOLUSM227NFWA
1970-01-01,80173
1970-02-01,77442
1970-03-01,90223
1970-04-01,89956
1970-05-01,97972
1970-06-01,100035
1970-07-01,106392
1970-08-01,106969
1970-09-01,95254
1970-10-01,96317
1970-11-01,89684
1970-12-01,89911
1971-01-01,85336
1971-02-01,80118
1971-03-01,92974
1971-04-01,98106
1971-05-01,103655
1971-06-01,105433
1971-07-01,112466
1971-08-01,112642
1971-09-01,101290
1971-10-01,102525
1971-11-01,95555
1971-12-01,95515

Now I'm trying to create a month plot with ggmonthplot() from the forecast package (which builds on ggplot2):

library(dplyr) # data management
library(tidyr) # data management
library(ggplot2) # visualization
library(gridExtra) # combine multiple plots
library(stats) # basic statistical tools
library(forecast) # main time series analysis package
library(lubridate) # data formatting
library(tseries) # addition to forecast
library(zoo) # z-ordered observations (irregular timeseries)

data = read.csv("Travelling_Statistics.csv")
timeseries = ts(data,start=c(1970,1),frequency=12) # time series in monthly frequency from January 1970
ggmonthplot(timeseries)

I get the following error:

Error in data.frame(y = as.numeric(x), year = trunc(time(x)), season = as.numeric(phase)) : arguments imply differing number of rows: 1176, 588

What could be the cause for this error message and how can it be fixed?
Nothing I found online seems to be related to my problem.
Thanks in advance!

Unreal RatzZ
  • https://stackoverflow.com/questions/26147558/what-does-the-error-arguments-imply-differing-number-of-rows-x-y-mean – tjebo Jun 11 '22 at 11:32
  • Since both the date and miles column have exactly the same length, how could I fix this problem? – Unreal RatzZ Jun 11 '22 at 12:09

1 Answer

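The problem is that ts() receives the whole data frame, including the DATE column. ts() coerces a data frame to a numeric matrix, so you end up with a two-column multivariate series: 588 rows times 2 columns gives the 1176 values in the error message, while time() still returns only 588 time points. Because you already pass start and frequency, ts() does not need the dates at all (and cannot use them).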
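
The length mismatch can be reproduced with base R alone. The following is a minimal sketch on a made-up 24-row stand-in for the CSV (the values and the names `both` and `miles` are illustrative assumptions, not the real data):

```r
# Made-up stand-in for the CSV: 24 monthly rows, two columns.
df <- data.frame(
  DATE = format(seq(as.Date("1970-01-01"), by = "month", length.out = 24)),
  TRFVOLUSM227NFWA = seq(80000, by = 1000, length.out = 24)
)

# Passing the whole data frame: ts() coerces it to a numeric matrix,
# so the result is a two-column multivariate series.
both <- ts(df, start = c(1970, 1), frequency = 12)
dim(both)                 # rows and columns of the series
length(as.numeric(both))  # counts every cell: rows * columns
length(time(both))        # one time point per row

# Passing only the value column gives a univariate series,
# where the two lengths agree.
miles <- ts(df$TRFVOLUSM227NFWA, start = c(1970, 1), frequency = 12)
length(as.numeric(miles)) == length(time(miles))
```

With two columns, the number of values is twice the number of time points, which is the same 2:1 ratio as 1176 to 588 in the error.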

Therefore, just convert the value column to a time series:

library(forecast)

df <- read.table(text= "DATE,TRFVOLUSM227NFWA
1970-01-01,80173
1970-02-01,77442
1970-03-01,90223
1970-04-01,89956
1970-05-01,97972
1970-06-01,100035
1970-07-01,106392
1970-08-01,106969
1970-09-01,95254
1970-10-01,96317
1970-11-01,89684
1970-12-01,89911
1971-01-01,85336
1971-02-01,80118
1971-03-01,92974
1971-04-01,98106
1971-05-01,103655
1971-06-01,105433
1971-07-01,112466
1971-08-01,112642
1971-09-01,101290
1971-10-01,102525
1971-11-01,95555
1971-12-01,95515", sep = ",", header = TRUE)

timeseries = ts(df$TRFVOLUSM227NFWA,start=c(1970,1),frequency=12) # time series in monthly frequency from January 1970

ggmonthplot(timeseries)
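
If you would rather not hard-code the start, one option is to derive it from the first DATE entry. This is only a sketch, assuming the dates are in YYYY-MM-DD format and the rows are sorted and gap-free as stated in the question; `first_date` and `start_ym` are names made up for illustration:

```r
# Two-row stand-in using the first two rows of the question's data.
df <- data.frame(DATE = c("1970-01-01", "1970-02-01"),
                 TRFVOLUSM227NFWA = c(80173, 77442))

# Take year and month from the first date instead of typing them in.
first_date <- as.Date(df$DATE[1])
start_ym <- c(as.integer(format(first_date, "%Y")),
              as.integer(format(first_date, "%m")))

# Still pass only the value column; the dates are used just for the start.
timeseries <- ts(df$TRFVOLUSM227NFWA, start = start_ym, frequency = 12)
start(timeseries)
```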

tjebo