5

I was wondering if it is possible/how easy it would be to implement a TFX pipeline (on a real dataset, with 100+ GB dataset, not a tutorial with a small dataset) in AWS?

For the orchestration, I might use Kubeflow. But I suppose, the major issue would be setting up a proper scalable runner for the Apache Beam. I am thinking of using Apache Flink for that.

Anyone with experience doing it? How would you go about putting a TF in production in AWS in general when you need to train the model on a regular basis on new data, do you write the pipeline from scratch or use some tool?

  • Did you solve this problem? How did you resolve this issue? Please share some deatils. – Raj Jun 23 '22 at 23:20

0 Answers0