I'm working on creating event streams using the Outbox pattern. I would like to know why one would choose the Outbox pattern instead of running CDC directly on the required tables.
Pros of using CDC directly:
- The stream is always in order regardless of when event capture is introduced, because the connector first takes a snapshot of all existing data and then captures changes from that point on.
- It does not require application changes; the application keeps working as is.
Cons:
- The DB change events have to be parsed manually (or with an existing parser class, like the one available for outbox events).
- It does not filter out unnecessary events. E.g., if a record changes 100 times but only the initial and final states are required, all 100 events are still emitted. Writing selectively to the outbox alleviates this problem.
On further reading, one point that came up in favor of the outbox is that it decouples the database design from the message contract. The downside that bothers me, however, is that the outbox only captures events from the day the code goes live. All earlier events have to be replayed and ingested into the outbox, which breaks the order of the stream because older events show up as the latest events in the outbox. That is not a concern when using CDC directly.
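The ordering concern can be illustrated with a small sketch, again in Python. The `OutboxRow` type, the event names, and the dates are made up for illustration; the only assumption is that a relay or connector publishes outbox rows in insertion order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboxRow:
    seq: int          # insertion position in the outbox (assumed publish order)
    occurred_at: str  # when the change really happened, as an ISO date
    event: str


def build_outbox():
    """Live events are written first; historical events are backfilled afterwards."""
    outbox = []

    def append(occurred_at, event):
        outbox.append(OutboxRow(len(outbox) + 1, occurred_at, event))

    # Events captured after the outbox code went live (hypothetical dates).
    append("2024-03-01", "OrderShipped")
    append("2024-03-02", "OrderDelivered")

    # Backfill of a change that happened before go-live.
    append("2023-11-15", "OrderCreated")
    return outbox


def publish_order(outbox):
    """Order in which consumers would see the events (by outbox position)."""
    return [row.event for row in sorted(outbox, key=lambda r: r.seq)]


def true_order(outbox):
    """Order in which the changes actually happened."""
    return [row.event for row in sorted(outbox, key=lambda r: r.occurred_at)]
```

Here `publish_order` yields `OrderShipped`, `OrderDelivered`, `OrderCreated`, while `true_order` starts with `OrderCreated`, which is exactly the mismatch described above.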
Any insights on which approach is more efficient here?