We are looking to build a single pipeline within a code repository that cleans, harmonizes, and transforms data into features of interest. We would like to apply that same pipeline code to different inputs and then compare the outputs.
For example, we would like to run the pipeline on three inputs: synthetic data, version 1 of the 'real' data (retrospective data only), and version 2 of the 'real' data (retrospective and prospective data). One comparison of the outputs could be the percentage of patients with diabetes in version 1 versus version 2.
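To make the goal concrete, here is a rough sketch in plain Python of what we mean. It is not Foundry-specific: the function names (`run_pipeline`, `percent_with_diabetes`), the field names (`patient_id`, `diagnosis`), and the tiny input datasets are all hypothetical placeholders, not our real schema or data.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def run_pipeline(rows: Iterable[Record]) -> List[Record]:
    """Hypothetical stand-in for the shared clean/harmonize/transform logic."""
    features: List[Record] = []
    for row in rows:
        patient_id = row.get("patient_id")
        if patient_id is None:
            continue  # cleaning: drop rows without a patient identifier
        # harmonizing: normalize whitespace and casing of the diagnosis label
        diagnosis = str(row.get("diagnosis") or "").strip().lower()
        # transforming: derive the feature of interest
        features.append(
            {"patient_id": patient_id, "has_diabetes": diagnosis == "diabetes"}
        )
    return features


def percent_with_diabetes(features: List[Record]) -> float:
    """Percentage of patients flagged with diabetes in one pipeline output."""
    if not features:
        return 0.0
    flagged = sum(1 for f in features if f["has_diabetes"])
    return 100.0 * flagged / len(features)


# Made-up placeholder inputs standing in for the three scenarios.
real_v1 = [
    {"patient_id": 10, "diagnosis": " diabetes "},
    {"patient_id": 11, "diagnosis": "asthma"},
    {"patient_id": None, "diagnosis": "diabetes"},
    {"patient_id": 12, "diagnosis": None},
]
inputs = {
    "synthetic": [
        {"patient_id": 1, "diagnosis": "Diabetes"},
        {"patient_id": 2, "diagnosis": "none"},
    ],
    "real_v1": real_v1,
    "real_v2": real_v1 + [{"patient_id": 13, "diagnosis": "DIABETES"}],
}

# The same pipeline code is applied to every input; only the data differs.
summary = {
    name: percent_with_diabetes(run_pipeline(rows)) for name, rows in inputs.items()
}
for name, pct in summary.items():
    print(f"{name}: {pct:.1f}% with diabetes")
```

The point is that the logic lives in one place and the inputs are the only thing that changes between the three runs, so any difference in the summary comes from the data rather than from diverging code.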
I saw that you can template code repositories in Foundry. Is this a viable option? Could we template our code repository and apply it to the three scenarios described above? Or is there a better option?