
I was trying to use ML.NET to run a simple linear regression test on a small dataset (760 items), but when I call Fit() to get a transformer it throws an error: "Column 'Time' has values of DateTime, which is not the same as earlier observed type of Single."

It looks as if the function confused the type of each column in the CSV?

The code:


        public class PriceData
        {
            [ColumnName("Time")]
            [LoadColumn(0)]
            public DateTime Time;

            [ColumnName("ClosePrice")]
            [LoadColumn(1)]
            public float ClosePrice;

        }
       
       


        public class DemandPrediction
        {
            [ColumnName("Score")]
            public float ApproximationScore;
        }



        static string URL = "https://bittrex.com/Api/v2.0/pub/market/GetTicks?marketName=USD-BTC&tickInterval=day";
        public static void Main(string[] args)
        {
            try
            {
                List<PriceData> priceData = new List<PriceData>();
                DownloadData(URL,ref priceData);

                var context = new MLContext();
                var data = context.Data.LoadFromTextFile<PriceData>("data.csv",hasHeader:false,separatorChar:',');

                var pipeline = context.Transforms.Concatenate(outputColumnName:"PriceData",
                     nameof(PriceData.Time),
                     nameof(PriceData.ClosePrice)
                   ).AppendCacheCheckpoint(context);

               
                var trainerType = context.Regression.Trainers.OnlineGradientDescent(lossFunction: new TweedieLoss());
                var fullPipeline = pipeline.Append(trainerType);
                var model = pipeline.Fit(data);
                var predictions = model.Transform(data);
                var metrics = context.Regression.Evaluate(
                                                            data: predictions,
                                                            labelColumnName: "ClosePrice",
                                                            scoreColumnName: "Score");

                
                var sample = new PriceData { Time =Convert.ToDateTime("7/6/2020 12:00:00 AM"), ClosePrice = 9273.18f };

                // create a prediction engine
                var engine = context.Model.CreatePredictionEngine<PriceData, DemandPrediction>(model);

                // make the prediction
                var prediction = engine.Predict(sample);
                

                Console.ReadKey();

            }
            catch (WebException e)
            {
                Console.WriteLine(e.Message);
            }
        }
PontiacGTX
  • It may be better to use time series forecasting instead of a regression model - https://learn.microsoft.com/en-us/dotnet/machine-learning/tutorials/time-series-demand-forecasting – Jon Jul 06 '20 at 21:16
  • @Jon What difference is there in terms of accuracy? – PontiacGTX Jul 07 '20 at 00:07
  • Whenever I try to call mlContext.Forecasting.ForecastBySsa, it doesn't show up in my ML.NET version 1.5.0. Was there an update that removed that function? Edit: https://learn.microsoft.com/en-us/dotnet/api/microsoft.ml.timeseriescatalog.forecastbyssa?view=ml-dotnet – PontiacGTX Jul 07 '20 at 00:27
  • It has its own NuGet package - https://www.nuget.org/packages/Microsoft.ML.TimeSeries/ – Jon Jul 07 '20 at 23:23
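
For anyone following the time series suggestion in the comments, here is a minimal sketch of what a ForecastBySsa call might look like once the Microsoft.ML.TimeSeries package is referenced. The `ClosePoint` and `CloseForecast` classes, the `ForecastSketch` helper, the synthetic series, and the window/series/horizon sizes are all assumptions made up for illustration; they are not taken from the question and are not tuned for real price data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Transforms.TimeSeries; // from the Microsoft.ML.TimeSeries NuGet package

// Hypothetical input row: only the value being forecast is needed.
public class ClosePoint
{
    public float ClosePrice { get; set; }
}

// Hypothetical output row: one forecast value per step of the horizon.
public class CloseForecast
{
    public float[] Forecast { get; set; }
}

public static class ForecastSketch
{
    // Trains an SSA forecaster on the series and returns `horizon` future values.
    public static float[] Forecast(IEnumerable<ClosePoint> series, int horizon)
    {
        var context = new MLContext(seed: 0);
        var rows = series.ToList();
        IDataView data = context.Data.LoadFromEnumerable(rows);

        // Assumed sizes, chosen only so the sketch runs on a short series.
        var pipeline = context.Forecasting.ForecastBySsa(
            outputColumnName: nameof(CloseForecast.Forecast),
            inputColumnName: nameof(ClosePoint.ClosePrice),
            windowSize: 7,
            seriesLength: 30,
            trainSize: rows.Count,
            horizon: horizon);

        var model = pipeline.Fit(data);

        // The time series engine forecasts from the end of the training data.
        var engine = model.CreateTimeSeriesEngine<ClosePoint, CloseForecast>(context);
        return engine.Predict().Forecast;
    }

    public static void Main()
    {
        // Synthetic, steadily rising series standing in for the downloaded prices.
        var series = Enumerable.Range(0, 120)
            .Select(i => new ClosePoint { ClosePrice = 7000f + 10f * i });

        Console.WriteLine(string.Join(", ", Forecast(series, horizon: 5)));
    }
}
```

Note that the forecaster only consumes the price column here; the Time column is implied by the row order, so the DateTime/Single mismatch from the question does not come up in this setup.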

1 Answer


If you open the CSV in Notepad, what do the Time values look like? Are they integer values? If so, you may need to convert them from integer values to DateTime. See: How do I convert an Excel serial date number to a .NET DateTime?
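
If the Time column did turn out to hold Excel-style serial numbers, a conversion could be sketched roughly like this. `DateTime.FromOADate` is part of .NET; the `SerialDate` helper and the sample serial value are assumptions for illustration only.

```csharp
using System;

public static class SerialDate
{
    // Converts an Excel/OLE Automation serial date (days since 1899-12-30) to a DateTime.
    // Excel serials match OLE Automation dates for anything after 1 March 1900.
    public static DateTime ToDateTime(double serial) => DateTime.FromOADate(serial);
}

public static class SerialDateDemo
{
    public static void Main()
    {
        // 43251 is a made-up sample value, not taken from the question's file.
        DateTime parsed = SerialDate.ToDateTime(43251);
        Console.WriteLine(parsed.ToString("yyyy-MM-dd"));
    }
}
```

This only applies when the file really contains numeric serials; if the values are already date strings, no such conversion is needed.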

Jonathan
  • A single line of the CSV looks like ```5/31/2018 12:00:00 AM,7560.00``` (https://imgur.com/a/VE2Uo5t). Also, the data is recognized as DateTime and Single: https://imgur.com/a/nz2dMzf – PontiacGTX Jul 06 '20 at 21:07
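
Since the columns do load as DateTime and Single, one likely explanation is the Concatenate step: as far as I know it requires every input column to share the same item type, which would produce exactly this kind of message when a DateTime column is combined with a float column. A possible workaround, sketched below, is to turn Time into a float before building the feature vector. The `PriceRow` class, the `RegressionSketch` helper, and the "days since first sample" encoding are assumptions for illustration, not the asker's code. The sketch also keeps ClosePrice out of the features and passes it as the label, and it calls Fit on the pipeline that includes the trainer (the question's code calls `pipeline.Fit` rather than `fullPipeline.Fit`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type: Time is replaced by a float so every feature is Single.
public class PriceRow
{
    public float DaysSinceStart { get; set; }
    public float ClosePrice { get; set; }
}

public static class RegressionSketch
{
    // Encodes a timestamp as the number of days elapsed since `start`.
    public static float ToDays(DateTime time, DateTime start) =>
        (float)(time - start).TotalDays;

    // Builds and fits a pipeline whose feature vector contains only float columns.
    public static ITransformer Train(
        MLContext context,
        IEnumerable<(DateTime Time, float Close)> raw)
    {
        var list = raw.ToList();
        DateTime start = list.Min(r => r.Time);

        var rows = list.Select(r => new PriceRow
        {
            DaysSinceStart = ToDays(r.Time, start),
            ClosePrice = r.Close
        });
        IDataView data = context.Data.LoadFromEnumerable(rows);

        var pipeline = context.Transforms
            .Concatenate("Features", nameof(PriceRow.DaysSinceStart))
            // Scale the feature; online gradient descent tends to be sensitive to raw magnitudes.
            .Append(context.Transforms.NormalizeMinMax("Features"))
            .Append(context.Regression.Trainers.OnlineGradientDescent(
                labelColumnName: nameof(PriceRow.ClosePrice),
                featureColumnName: "Features"));

        // Fit the full chain, trainer included.
        return pipeline.Fit(data);
    }
}
```

Whether a plain regression on elapsed days gives useful price predictions is a separate question; the sketch is only meant to show a feature vector with a single consistent type.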