Debezium with outbox pattern
Setting the context:
- Using
- We want to use Schema Registry to store all event schemas for the different business entities.
- One topic can have multiple versions of the same schema.
- One topic can have entirely different schemas bounded by a business context, e.g. customerCreated, customerPhoneUpdated, customerAddressUpdated (using one of the subject name strategies).
- We want to verify whether Debezium supports points 2 and 3 (especially 3).
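To make the subject name strategies mentioned above concrete, here is a minimal sketch of how each of the three Confluent strategies derives the Schema Registry subject for a record value. The fully qualified record names are hypothetical; only the topic name comes from this question.

```python
def subject_name(strategy, topic, record_name, is_key=False):
    """Return the Schema Registry subject a record is registered under."""
    if strategy == "TopicNameStrategy":
        # Default: one subject per topic, so one schema lineage per topic.
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy == "RecordNameStrategy":
        # One subject per record type, independent of the topic.
        return record_name
    if strategy == "TopicRecordNameStrategy":
        # One subject per (topic, record type) pair.
        return f"{topic}-{record_name}"
    raise ValueError(f"unknown strategy: {strategy}")


TOPIC = "com.business.event"
# Hypothetical fully qualified record names, for illustration only.
RECORDS = ("com.business.CustomerCreated", "com.business.OrderCreated")

for strategy in ("TopicNameStrategy", "RecordNameStrategy", "TopicRecordNameStrategy"):
    for record in RECORDS:
        print(strategy, "->", subject_name(strategy, TOPIC, record))
```

Only the last two strategies give each event type its own subject, which is what allows unrelated schemas to share one topic.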
Imagine I have two business events, customerCreated and orderCreated, and I want to store both in the same topic "com.business.event".
customerCreated
{ "id": "244444", "name": "test", "address": "test 123", "email": "test@test.com" }
orderCreated
{ "id": "244444", "value": "1234", "address": "test 123", "phone": "3333", "deliverydate": "10-12-19" }
The structure of my outbox table follows this article:
https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
    Column     |          Type          | Modifiers
---------------+------------------------+-----------
 id            | uuid                   | not null
 aggregatetype | character varying(255) | not null
 aggregateid   | character varying(255) | not null
 type          | character varying(255) | not null
 payload       | jsonb                  | not null
Now when I write a business event to the above table, the customerCreated or orderCreated event is stored in the payload column as a JSON string. If I publish this to the Kafka topic "com.business.event" using the Debezium connector, it produces the messages below (printed with the schema for illustration).
customerCreated.json
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "string",
        "optional": false,
        "field": "eventType"
      },
      {
        "type": "string",
        "optional": false,
        "name": "io.debezium.data.Json",
        "version": 1,
        "field": "payload"
      }
    ],
    "optional": false
  },
  "payload": {
    "eventType": "Customer Created",
    "payload": "{\"id\": \"2971baea-e5a0-46cb-b1b1-273eaf88246a\", \"name\": \"jitender\", \"email\": \"test\", \"address\": \"700 \"}"
  }
}
orderCreated.json
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "string",
        "optional": false,
        "field": "eventType"
      },
      {
        "type": "string",
        "optional": false,
        "name": "io.debezium.data.Json",
        "version": 1,
        "field": "payload"
      }
    ],
    "optional": false
  },
  "payload": {
    "eventType": "Order Created",
    "payload": "{\"id\": \"2971baea-e5a0-46cb-b1b1-273eaf88246a\", \"value\": \"123\", \"deliverydate\": \"10-12-19\", \"address\": \"test\", \"phone\": \"700 \"}"
  }
}
Problem:
As you can see in the examples above, the schema in Schema Registry/Kafka stays the same even though the payload contains different business entities: both messages declare only two string fields, eventType and payload. When I, as a consumer, deserialise such a message, I have to know already that the payload can have a different structure depending on the business event it was generated from. In this scenario I cannot make full use of Schema Registry, because the consumer must know all the business entities in advance.
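To illustrate the problem, here is a minimal sketch of what a consumer of the messages above ends up doing. The REQUIRED_FIELDS table, decode_outbox_message and make_envelope are hypothetical names introduced for this example; the field lists are taken from the sample events in this question.

```python
import json

# Hypothetical hand-maintained knowledge of every business event.
# This is exactly what the registered schema does not describe.
REQUIRED_FIELDS = {
    "Customer Created": {"id", "name", "address", "email"},
    "Order Created": {"id", "value", "address", "phone", "deliverydate"},
}


def decode_outbox_message(raw):
    """Decode an envelope whose registered schema only says 'two strings'."""
    envelope = json.loads(raw)
    event_type = envelope["payload"]["eventType"]
    # The business event is an opaque JSON string, so it needs a second parse.
    business_event = json.loads(envelope["payload"]["payload"])
    expected = REQUIRED_FIELDS.get(event_type)
    if expected is None:
        raise ValueError(f"unknown event type: {event_type}")
    missing = expected - set(business_event)
    if missing:
        raise ValueError(f"{event_type} is missing fields: {sorted(missing)}")
    return event_type, business_event


def make_envelope(event_type, business_event):
    """Build a message shaped like the samples above (schema part omitted)."""
    return json.dumps(
        {"payload": {"eventType": event_type, "payload": json.dumps(business_event)}}
    )


order = {"id": "244444", "value": "1234", "address": "test 123",
         "phone": "3333", "deliverydate": "10-12-19"}
print(decode_outbox_message(make_envelope("Order Created", order)))
```

All structural validation lives in consumer code here, keyed on the eventType string, rather than in a schema fetched from the registry.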
Questions:
- What I want is for Debezium to create two different schemas under the same topic "com.business.event" using a subject name strategy, as in this example: https://karengryg.io/2018/08/18/multi-schemas-in-one-kafka-topic/
Then, when I consume a message, my consumer reads the schema ID from the message, fetches that schema from Schema Registry, and decodes the message directly with it. After decoding, I can ignore the message if I am not interested in that business event. This way I can have different schemas under the same topic using Schema Registry.
- Can I control the schema in the Kafka topic when I use Debezium in conjunction with Schema Registry? Using the outbox table (outbox pattern) is a must.
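The consumer flow I have in mind in the first question can be sketched as follows, assuming messages are serialised in the Confluent wire format (one magic byte 0, a 4-byte big-endian schema ID, then the encoded body). The REGISTRY dict is an in-memory stand-in for Schema Registry, and the schema IDs, INTERESTING set and helper names are hypothetical.

```python
import struct

MAGIC_BYTE = 0

# In-memory stand-in for Schema Registry: schema ID -> record name.
# The IDs are made up; a real consumer would fetch the schema by ID.
REGISTRY = {1: "customerCreated", 2: "orderCreated"}

# Hypothetical: this consumer only cares about orders.
INTERESTING = {"orderCreated"}


def split_confluent_frame(message):
    """Split a framed message into (schema_id, encoded_body)."""
    if len(message) < 5 or message[0] != MAGIC_BYTE:
        raise ValueError("message is not in Confluent wire format")
    (schema_id,) = struct.unpack(">I", message[1:5])
    return schema_id, message[5:]


def handle(message):
    """Return (record_name, body) for interesting events, else None."""
    schema_id, body = split_confluent_frame(message)
    record_name = REGISTRY[schema_id]
    if record_name not in INTERESTING:
        return None  # ignore business events this consumer does not need
    # A real consumer would now decode `body` with the fetched schema.
    return record_name, body


def frame(schema_id, body):
    """Test helper: wrap a body in the wire-format header."""
    return bytes([MAGIC_BYTE]) + struct.pack(">I", schema_id) + body


print(handle(frame(1, b"customer-bytes")))
print(handle(frame(2, b"order-bytes")))
```

With this approach the consumer needs no prior knowledge of the event types in the topic; the schema ID in each message is enough to decide whether and how to decode it.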