I'm working with Pentaho Kettle (PDI) and I'm trying to manage a flow in which a few transformations work as if they were functions. To be more specific: I've created some transformations that modify a few fields of a CSV file. Each transformation acts on just one field of the file. So the first transformation modifies values only in, for example, the first column, the second transformation works on another column, and so on. Since I spent time creating every single transformation, I would like them to be reusable in other jobs/transformations that work with the same kind of values. As an example, I've created a transformation that improves the quality of phone numbers (and many others).
Here's a "general" idea of a main job:
My problem is about passing data through the transformations. To do that, every transformation reads its data from the result table using the "Get rows from result" step. After making all the modifications, it puts the data back in the result table using the "Copy rows to result" step. Here is just a sample (of course the real transformations are more complicated than this one).
As you probably know, we have to specify the incoming fields in the "Get rows from result" step, so if I want to use this transformation in another job/transformation that works with a different file, I have to change the schema in that step.
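To make clearer what I mean by "like functions", here is a rough sketch in plain Python (not PDI code, just an illustration). The names `clean_phone` and `apply_to_field` are made up for this example, and the cleanup rule itself is only a placeholder. The point is that the field to work on is passed in as an argument, and all other fields pass through untouched, whatever the schema of the file is.

```python
import re

def clean_phone(value):
    """Hypothetical cleanup rule: keep the digits and an optional leading '+'."""
    value = value.strip()
    prefix = "+" if value.startswith("+") else ""
    return prefix + re.sub(r"\D", "", value)

def apply_to_field(rows, field, func):
    """Apply func to one named field; every other field is passed through as-is."""
    return [{**row, field: func(row[field])} for row in rows]

# The same "transformation" works on any file that has a phone-like column,
# regardless of which other columns are present.
rows = [{"name": "Ann", "phone": " +39 (02) 123-456 "}]
cleaned = apply_to_field(rows, "phone", clean_phone)
```

This is the behaviour I would like to get from a reusable transformation: only the name of the field changes from one job to another.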
Maybe there's a different, easier way to move the data through the flow. I've also considered using parameters, but I don't know whether it's possible to set them from fields coming from the result table. And here's the other question: is the result table the only way to return values from a transformation?
I've also considered executing all the transformations in parallel, inside a single transformation, passing each of them just the value it is interested in plus a key, and then merging all the single fields back together with a "Merge join" step. This approach also has a synchronization problem. Does anyone know a good way to solve this? I assume there is a standard method for doing all this.
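In case the parallel idea is not clear, this is roughly what I have in mind, again sketched in plain Python rather than PDI. The helper names `split_field` and `merge_by_key`, the `id` key column, and the two cleanup rules are all assumptions made for the illustration. Each field is extracted together with the key, cleaned independently, and then joined back to the original rows by key.

```python
def split_field(rows, key, field):
    """Extract (key, value) pairs for a single field."""
    return [(row[key], row[field]) for row in rows]

def merge_by_key(rows, key, field, pairs):
    """Write cleaned values back into the rows, matching them up by key."""
    lookup = dict(pairs)
    return [{**row, field: lookup[row[key]]} for row in rows]

rows = [
    {"id": 1, "name": " ann ", "phone": "02-123"},
    {"id": 2, "name": "BOB", "phone": "06-456"},
]

# Each branch sees only the key and the one field it is responsible for.
names = [(k, v.strip().title()) for k, v in split_field(rows, "id", "name")]
phones = [(k, v.replace("-", "")) for k, v in split_field(rows, "id", "phone")]

# Join the branches back together on the key.
merged = merge_by_key(rows, "id", "name", names)
merged = merge_by_key(merged, "id", "phone", phones)
```

In this sketch the merge only works once every branch has produced all of its rows, which is the synchronization problem I mentioned.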