
I have data from web users in Firestore.

I have inserted some of this data into Google BigQuery in order to run a machine learning model.

I have experience training machine learning models, but I don't have experience obtaining predictions for new data once the model is trained.

I have read that I can upload the trained model to Google Cloud Storage and then deploy it to AI Platform, but I don't know the process I have to follow. New data is going to be inserted into BigQuery, and with this new data I want to make predictions and then put those predictions back into Firestore.

I think this could be done with Dataflow (Apache Beam) or Cloud Composer (Airflow), where I can automate the process and schedule it to run every week, but I don't have experience with these technologies. Can anyone recommend which technology would be better for this particular case, so I can look up information on how to use it?

One possibility could be to save the model in AI Platform or in Google Cloud Storage, and then call this saved model from Cloud Functions to make predictions and save them in Firestore?

J.C Guzman

3 Answers


BigQuery ML supports external TensorFlow models.

TensorFlow model importing. This feature allows you to create BigQuery ML models from previously-trained TensorFlow models, then perform prediction in BigQuery ML. See the CREATE MODEL statement for importing TensorFlow models for more information.

So what you want to achieve is:

  • Get a table in BigQuery
  • Build out a feature set for your model (SELECT statements)
  • CREATE MODEL in BigQuery (rerun this to re-train); see the first sketch after this list
  • Run ML.PREDICT (or equivalent) to get predictions on new data
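A minimal sketch of the import step, using the Python BigQuery client. The dataset, model name, and GCS path are placeholders, not values from the question:

```python
# Minimal sketch: import a previously trained TensorFlow SavedModel
# (exported to Cloud Storage) into BigQuery ML.
# `my_dataset`, `web_users_model` and the bucket path are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

client.query("""
    CREATE OR REPLACE MODEL `my_dataset.web_users_model`
    OPTIONS (MODEL_TYPE = 'TENSORFLOW',
             MODEL_PATH = 'gs://my-bucket/saved_model/*')
""").result()  # wait for the DDL statement to complete
```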

As new data arrives in BigQuery you can:

  • retrain the model (externally or internally, depending on the type of algorithm you have)
  • use the new rows in predictions, as in the second sketch below
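And a sketch of the prediction step over newly arrived rows. The table and the `user_id`/`predicted_label` columns are assumptions for illustration; for an imported TensorFlow model the prediction columns are actually named after the model's output tensors:

```python
# Minimal sketch: run ML.PREDICT over a table of new rows.
# Table and column names are assumptions for illustration.
from google.cloud import bigquery

client = bigquery.Client()

rows = client.query("""
    SELECT user_id, predicted_label
    FROM ML.PREDICT(
        MODEL `my_dataset.web_users_model`,
        (SELECT * FROM `my_dataset.new_user_data`))
""").result()

for row in rows:
    print(row.user_id, row.predicted_label)
```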

https://cloud.google.com/bigquery-ml/docs/bigqueryml-intro

Pentium10

For doing this you need 2 services:

  1. One for the prediction, which serves your model
  2. One for getting the predictions and storing the results in Firestore

Personally, I don't recommend storing your model in AI Platform today (a new release should happen by the end of the month, but today it's a no!). I wrote an article about hosting a TensorFlow model on Cloud Run. It should work with other frameworks, but I had only built a TensorFlow model, and I used it for my tests.

The best solution, if your new data is in BigQuery and your model is in TensorFlow, is to load your model into BigQuery. The prediction is free of charge; you only pay for the data processed by your query (I'm also writing an article on this, but I'm waiting for the new AI Platform release to provide a correct comparison between both solutions).

After getting the predictions (result of BigQuery + call to Cloud Run, OR result of BigQuery with the predict clause), you have to iterate over the results to store them in Firestore. I recommend a batch write to Firestore, as in the sketch below.
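A minimal sketch of that second service, assuming the predictions come back from BigQuery with hypothetical `user_id` and `predicted_label` columns:

```python
# Minimal sketch: read predictions from BigQuery and batch-write them
# to Firestore. Collection, table and column names are illustrative.
from google.cloud import bigquery, firestore

bq = bigquery.Client()
db = firestore.Client()

rows = bq.query("""
    SELECT user_id, predicted_label
    FROM ML.PREDICT(
        MODEL `my_dataset.web_users_model`,
        (SELECT * FROM `my_dataset.new_user_data`))
""").result()

batch = db.batch()
pending = 0
for row in rows:
    doc = db.collection("predictions").document(str(row.user_id))
    batch.set(doc, {"prediction": row.predicted_label})
    pending += 1
    if pending == 500:  # a Firestore batch holds at most 500 writes
        batch.commit()
        batch = db.batch()
        pending = 0
if pending:
    batch.commit()
```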

guillaume blaquiere

I have read that I can upload the trained model to Google Cloud Storage

If you want to do this you can use Dataflow. You can write a pipeline that reads data from BigQuery and writes it to GCS, for example:
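A minimal sketch, assuming the Beam Python SDK; the query, bucket, and output prefix are placeholders:

```python
# Minimal sketch: a Beam pipeline that reads rows from BigQuery and
# writes them to Cloud Storage as JSON lines. Run with
# --runner=DataflowRunner (plus project/region/temp options) to
# execute on Dataflow; names below are placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions()

with beam.Pipeline(options=options) as p:
    (p
     | "ReadFromBigQuery" >> beam.io.ReadFromBigQuery(
           query="SELECT * FROM `my_dataset.new_user_data`",
           use_standard_sql=True)
     | "ToJson" >> beam.Map(lambda row: json.dumps(row, default=str))
     | "WriteToGCS" >> beam.io.WriteToText("gs://my-bucket/exports/part"))
```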

(I am not sure I understand how you want your job to interact with AI Platform and Firestore.)

Yueyang Qiu
  • Because what I have to write from BigQuery to Firestore are predictions from a trained machine learning model, I was asking how to do it – J.C Guzman Oct 21 '19 at 16:40