Thanks for asking, and sorry about the issue. Indeed, the API in Rbeast is a bit confusing because it was originally coded to handle satellite time series.
Regardless, the BEAST model in the package was formulated only for regular time series. (By regular, I mean equally spaced time series with the same number of data points per period.) Because leap years have 366 days but other years have 365, daily time series are treated in BEAST as irregular if the periodicity is one year. However, if the periodic variation is weekly (7 days), daily time series are considered regular. To handle irregular time series, I implemented the beast.irreg function, which accepts irregular inputs and aggregates them into regular time series before doing the decomposition and changepoint detection.
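Before the real example below, here is a minimal sketch with fake data to contrast the two cases (the data and variable names are made up for illustration; newer Rbeast versions may call the freq argument 'period'):

library(Rbeast)
# Two years of fake daily data with a weekly cycle: 7 data points per period,
# so this is a regular series and the plain beast function suffices
y = sin(2*pi*(1:730)/7) + rnorm(730, sd=0.3)
o = beast(y, freq=7)
plot(o)
# With a 1-year cycle, the leap-year issue makes daily data irregular, so use
# beast.irreg plus aggregation instead, as illustrated with the real data below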
To illustrate, I got a sample PM10 dataset for several regions (e.g., "WOLLONGONG" and "MUSWELLBROOK") from this site: https://www.dpie.nsw.gov.au/air-quality/air-quality-data-services/data-download-facility. I posted the CSV file (as well as another dataset on ozone) under https://github.com/zhaokg/Rbeast/tree/master/R/SampleData. You can read the files directly from R as shown below:
df = read.csv('https://github.com/zhaokg/Rbeast/raw/master/R/SampleData/pm10_1658112168.csv',header=FALSE,skip=3)
dates = as.Date(df[,1], "%d/%m/%Y") # the 1st col is dates
Y = df[,2:ncol(df)] # the remaining columns are PM10 data for several sample sites
# e.g., Y[,1] for the first region (WOLLONGONG)
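# Before running BEAST, it may help to eyeball the raw series and check for
# missing values (a quick base-R check; see also the note on zeros at the end):
y1 = Y[,1]
sum( is.na(y1) )           # how many values are already NA?
sum( y1==0, na.rm=TRUE )   # zeros may actually be missing values
plot(dates, y1, type='l', xlab='Date', ylab='PM10')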
library(Rbeast)
o = beast.irreg(log(Y[,1]),time=dates,deltat=1/12, freq=12, tseg.min=3, sseg.min=6)
# log(Y[,1]) : Log-transformation may help if the data is skewed because the BEAST
#              model assumes Gaussian errors
# time=dates : Use the 'time' arg to supply the times of individual data points.
#              Alternatively, the 'beast123' function also handles date strings of
#              different formats
# deltat=1/12: Aggregate the daily time series into a regular one at an interval
#              of 1 month = 1/12 year
# freq=12    : The period is 1 year, which is 12 data points (1.0/deltat = 12)
# tseg.min=3 : The minimum trend segment length allowed in the changepoint detection
#              is 3 data points (3 months) -- the results MAY be sensitive to this parameter
# sseg.min=6 : The minimum seasonal segment length allowed in the changepoint detection
#              is 6 data points (6 months) -- the results MAY be sensitive to this parameter
plot(o)
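# Besides plot(o), the output object can be queried directly. A sketch, assuming
# the output fields documented in Rbeast (e.g., o$trend$cp); run print(o) or see
# ?beast.irreg for the full list:
print(o)        # summary of the fit
o$trend$ncp     # estimated number of trend changepoints
o$trend$cp      # most probable trend changepoint times
o$trend$cpPr    # probabilities of the detected trend changepoints
o$season$cp     # most probable seasonal changepoint times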
# Of course, the daily time series can be re-sampled/aggregated to a different time
# interval. Below, it is aggregated into a half-monthly time series (dT = 1/24 year),
# and the number of data points per period is freq = 1.0 year/dT = 24
o = beast.irreg( log(Y[,1]),time=dates, deltat = 1/24, freq=24, tseg.min=12, sseg.min=12)
plot(o)
# Aggregated to a weekly time series (dT = 7/365 year: the unit again is year),
# and freq = 1 year/dT = 365/7.
# tcp.minmax=c(0,15): the min and max numbers of changepoints allowed in the trend component
# ocp=10 : allow up to 10 spikes in an extra outlier component (explained further below)
o = beast.irreg( log(Y[,1]),time=dates,deltat=7/365,freq=365/7, tcp.minmax=c(0,15),tseg.min=5, sseg.min=20,ocp=10)
plot(o)
# Finally, if you want to run at the daily interval, specify dT = deltat = 1/365 year and
# freq = period/dT = 1.0 year/(1/365 year) = 365. Because the raw data is daily,
# the majority of the raw data is kept intact during the aggregation, except when
# there is a leap year (the last two days of the leap year are merged into a single day)
o = beast.irreg( log(Y[,1]),time=dates, deltat = 1/365, freq=365/1, tseg.min=30, sseg.min=180)
plot(o)
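# BEAST is a Bayesian MCMC method, so the exact results can also differ slightly
# from run to run. To make runs reproducible, fix the sampler seed (a sketch,
# assuming the mcmc.seed argument; check ?beast.irreg for your version):
o = beast.irreg( log(Y[,1]), time=dates, deltat=1/365, freq=365, tseg.min=30,
                 sseg.min=180, mcmc.seed=42)
plot(o)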
By default, a time series is decomposed as Y = season + trend + error, but for your dataset in the original scale (i.e., not log-transformed), there could be some spikes. One way to model this is to add an extra spike/outlier component: Y = season + trend + outlier/spike + error.
# Use 'ocp' to specify the maximum number of spikes (either upward or downward) allowed in the outlier/spike component
o = beast.irreg(Y[,1],time=dates, deltat = 1/365, freq=365/1, tseg.min=30, sseg.min=180, ocp=10)
plot(o)
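# The detected spikes can be inspected similarly to the trend/season changepoints
# (a sketch, assuming the output exposes an outlier component when ocp > 0):
o$outlier$ncp   # estimated number of spikes
o$outlier$cp    # most probable spike times
o$outlier$cpPr  # probabilities of the detected spikes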
[Attached figure: an example for one time series analyzed at the weekly interval. Again, the exact results vary, depending on the choices of tseg.min and sseg.min.]

More importantly, another issue I noticed from your figure is that your data seem to have lots of missing values, which should be assigned NA but instead appear as zeros in your figure. If that is the case, the analysis results would certainly be wrong. BEAST can handle missing data, but the missing values should be given as NA or NaN (e.g., Y[Y==0]=NA).
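Putting this together, a sketch for your case (re-using the weekly setup from above) might look like:

y       = Y[,1]
y[y==0] = NA   # zeros appear to be placeholders for missing data; also, log(0) would be -Inf
o = beast.irreg( log(y), time=dates, deltat=7/365, freq=365/7, tseg.min=5, sseg.min=20)
plot(o)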