Not many resources are available online, but I want to build a data lineage system in Apache Atlas for data sourced from YugabyteDB. Any pointers are appreciated.
For example, below is the process that I have:
[TABLE A] --python function--> [TABLE B] --> [report x]
Let's say both Table A and Table B are in YugabyteDB.
The Python function aggregates the data from Table A and inserts it into Table B. Report X is then created from Table B.
Suppose I want to create lineage in Atlas for this process. I understand that I will have to create 4 entities: 2 table entities and 2 process entities (the Python function and the report). Then I will have to build the relationships between them. What I am not sure about is how any new data that arrives tomorrow will get reflected in Atlas.
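To make the four entities concrete, here is a rough sketch of the payload I think I would send to the Atlas REST API (POST /api/atlas/v2/entity/bulk). Everything specific in it is an assumption on my side: I use the generic built-in types `DataSet` and `Process` because I am not aware of a ready-made Yugabyte type, the qualified names, the `yb_demo` cluster name, and the helper names are made up, and I model the report as a process with no outputs to match the 4-entity count above.

```python
import json
import urllib.request

# Assumption: generic built-in Atlas types; a custom Yugabyte typedef could replace them.
TABLE_TYPE = "DataSet"
PROCESS_TYPE = "Process"


def ref(type_name, qualified_name):
    """Reference another entity by its unique qualifiedName."""
    return {"typeName": type_name, "uniqueAttributes": {"qualifiedName": qualified_name}}


def entity(type_name, qualified_name, name, **extra):
    """Build one entity dict; extra keyword arguments become attributes."""
    attrs = {"qualifiedName": qualified_name, "name": name}
    attrs.update(extra)
    return {"typeName": type_name, "attributes": attrs}


def build_lineage_payload(cluster="yb_demo"):
    """Return the payload for 2 table entities and 2 process entities."""
    # Hypothetical naming scheme: <schema>.<object>@<cluster>
    table_a = f"demo.table_a@{cluster}"
    table_b = f"demo.table_b@{cluster}"
    return {
        "entities": [
            # Tables come first so the processes below refer to entities already listed.
            entity(TABLE_TYPE, table_a, "table_a"),
            entity(TABLE_TYPE, table_b, "table_b"),
            # The Python aggregation function: table A -> table B
            entity(
                PROCESS_TYPE, f"demo.aggregate_a_to_b@{cluster}", "aggregate_a_to_b",
                inputs=[ref(TABLE_TYPE, table_a)],
                outputs=[ref(TABLE_TYPE, table_b)],
            ),
            # Report X, modelled here as a process that only reads table B
            entity(
                PROCESS_TYPE, f"demo.report_x@{cluster}", "report_x",
                inputs=[ref(TABLE_TYPE, table_b)],
                outputs=[],
            ),
        ]
    }


def post_to_atlas(payload, base_url, auth_header):
    """Send the payload to Atlas; base_url and auth_header are placeholders."""
    req = urllib.request.Request(
        f"{base_url}/api/atlas/v2/entity/bulk",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": auth_header},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only print the payload; nothing is sent unless post_to_atlas is called.
    print(json.dumps(build_lineage_payload(), indent=2))
```

My guess is that the Python function could call something like post_to_atlas at the end of each run, and that Atlas would update the existing entities rather than duplicate them because they are matched on qualifiedName, but I have not verified this. It may also be necessary to create the tables in a separate request before the processes that reference them.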