I want to train a machine learning model on some data. Before training the model I need to preprocess the data, so I have been reading about ways to do that.
The first approach I found is to create a Dataflow pipeline to load the data into BigQuery or Google Cloud Storage, and then build a data pipeline with Google Dataprep to clean it.
The other approach I read about is Data Fusion, which makes it easier to create data pipelines. Here is my doubt: is Data Fusion only for building a pipeline, like Dataflow, so that I would still have to use Dataprep to clean the data? Or can Data Fusion also clean the data and prepare it for my machine learning model?
If Data Fusion can clean the data the way Dataprep does, when should I use Dataprep?
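
To make "clean the data" concrete, here is a minimal sketch in plain Python of the kind of step I mean. It is not tied to Dataflow, Dataprep, or Data Fusion, and the field names (`age`, `country`) and the helpers `clean_record` and `clean_all` are made up for illustration; my real data has different columns.

```python
from typing import Optional


def clean_record(raw: dict) -> Optional[dict]:
    """Return a cleaned copy of a row, or None if the row should be dropped.

    The "age" and "country" fields are hypothetical placeholders.
    """
    # Normalize the age field and drop rows where it is missing or not a number.
    age_text = str(raw.get("age", "")).strip()
    if not age_text.isdecimal():
        return None

    # Normalize the country field and drop rows where it is empty.
    country = str(raw.get("country", "")).strip().upper()
    if not country:
        return None

    return {"age": int(age_text), "country": country}


def clean_all(rows):
    """Clean every row and keep only the rows that survive."""
    cleaned = (clean_record(row) for row in rows)
    return [row for row in cleaned if row is not None]
```

Whichever service I end up using would need to do the equivalent of this trimming, type conversion, and filtering of bad rows before the data reaches the model.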