Yes, I have already applied ARIMA in Spark for a univariate time series.
// Required imports: Spark SQL, MLlib linalg, and the spark-ts ARIMA classes
import java.util.List;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;
import com.cloudera.sparkts.models.ARIMA;
import com.cloudera.sparkts.models.ARIMAModel;

public static void main(String[] args)
{
    // Point Hadoop at winutils.exe when running locally on Windows
    System.setProperty("hadoop.home.dir", "C:/winutils");
    SparkSession spark = SparkSession
        .builder().master("local")
        .appName("Spark-TS Example")
        .config("spark.sql.warehouse.dir", "file:///C:/Users/abc/Downloads/Spark/sparkdemo/spark-warehouse/")
        .getOrCreate();

    // Read the univariate series: one observation per line
    Dataset<String> lines = spark.read().textFile("C:/Users/abc/Downloads/thunderbird/Time series/trainingvector_arima.csv");
    Dataset<Double> doubleDataset = lines.map(
        (MapFunction<String, Double>) line -> Double.parseDouble(line),
        Encoders.DOUBLE());

    // Collect the series to the driver and convert it to a primitive double[]
    List<Double> doubleList = doubleDataset.collectAsList();
    double[] values = new double[doubleList.size()];
    for (int i = 0; i < doubleList.size(); i++)
    {
        values[i] = doubleList.get(i);
    }

    // Wrap the series in an MLlib dense vector, which is what spark-ts expects
    Vector tsvector = Vectors.dense(values);
    System.out.println("Ts vector: " + tsvector.toString());

    // Either fit a fixed-order model...
    // ARIMAModel arimamodel = ARIMA.fitModel(1, 0, 1, tsvector, true, "css-bobyqa", null);
    // ...or let spark-ts search for the best (p, d, q) up to the given maxima
    ARIMAModel arimamodel = ARIMA.autoFit(tsvector, 1, 1, 1);

    Vector forecast = arimamodel.forecast(tsvector, 10);
    System.out.println("forecast of next 10 observations: " + forecast);
}
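If you already know the model order, you can use the commented-out fitModel call instead of autoFit. A minimal sketch of that variant, reusing the same call shown in the comment above (fit ARIMA(1,0,1) with an intercept and the "css-bobyqa" optimizer); the variable names here are just for illustration:

    // Fit a fixed ARIMA(1, 0, 1) model instead of searching for the order
    ARIMAModel fixedModel = ARIMA.fitModel(1, 0, 1, tsvector, true, "css-bobyqa", null);
    Vector fixedForecast = fixedModel.forecast(tsvector, 10);
    System.out.println("ARIMA(1,0,1) forecast: " + fixedForecast);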
This code works for me. Pass whatever series you want to forecast as the input data (the CSV read above), with one numeric value per line.
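Note that, as I understand the spark-ts API, the vector returned by forecast(ts, nFuture) holds the fitted values for the original series followed by the nFuture forecasted points, so the actual 10-step forecast is the tail of that vector. A sketch of how to pull those out, assuming that layout:

    // Assumption: forecast(tsvector, 10) returns tsvector.size() fitted values
    // followed by the 10 future points, so the real forecasts are the last 10 entries
    double[] all = forecast.toArray();
    double[] next10 = java.util.Arrays.copyOfRange(all, all.length - 10, all.length);
    System.out.println("next 10 points only: " + java.util.Arrays.toString(next10));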