I want to introduce data quality testing (empty fields, min/max values, regex matching, etc.) into my pipeline, which will essentially consume Kafka topics and test the data before it is written to the DB.
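To make the kind of check concrete, here is a minimal, framework-agnostic sketch in plain Python of roughly what I have in mind per record. The field names (`user_id`, `age`, `email`), the thresholds, and the helper names `validate` and `split_batch` are all hypothetical. The Kafka consumer and the DB writer are left out, and it uses neither Deequ nor Great Expectations.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Rule = Callable[[Record], bool]

# Hypothetical pattern: a deliberately loose email shape check.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical rules keyed by a readable name; each returns True when the record passes.
RULES: Dict[str, Rule] = {
    "user_id is not empty": lambda r: r.get("user_id") not in (None, ""),
    "age within 0..120": lambda r: isinstance(r.get("age"), (int, float))
    and 0 <= r["age"] <= 120,
    "email matches pattern": lambda r: isinstance(r.get("email"), str)
    and EMAIL_RE.match(r["email"]) is not None,
}


def validate(record: Record) -> List[str]:
    """Return the names of all rules the record fails (empty list means it passed)."""
    return [name for name, rule in RULES.items() if not rule(record)]


def split_batch(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split a batch of decoded messages into valid records and (record, failures) pairs."""
    valid: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for rec in records:
        failures = validate(rec)
        if failures:
            rejected.append((rec, failures))
        else:
            valid.append(rec)
    return valid, rejected
```

The idea would be to call something like `split_batch` on each batch of consumed messages, write only the valid records to the DB, and route the rejected ones (with their failure reasons) somewhere else for inspection.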
I am having a hard time choosing between the Deequ and Great Expectations frameworks. Deequ lacks clear documentation but has "anomaly detection", which can compare previous scans to current ones. Great Expectations has very nice, clear documentation and thus less overhead. As far as I can tell, neither of these frameworks is made specifically for streaming data.
Can anyone offer some advice/other framework suggestions?