I am trying to create a Kafka topic configuration that uses both compaction and deletion, to achieve the following:
- Within the retention period, retain the latest version of each key
- After the retention period, remove any message older than the retention cutoff
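To make the two goals concrete, here is a minimal sketch of the intended end state. It models only the desired outcome, not how Kafka's log cleaner actually works. The `Record` type, the `expected_view` function, and the sample data are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    # Hypothetical stand-in for a Kafka record; not a real client class.
    key: str
    value: str
    timestamp_ms: int


def expected_view(records: List[Record], now_ms: int, retention_ms: int) -> Dict[str, Record]:
    """Desired outcome: latest record per key, minus anything older than the cutoff."""
    cutoff_ms = now_ms - retention_ms
    latest: Dict[str, Record] = {}
    for record in records:  # records are assumed to be in offset order
        latest[record.key] = record  # a later offset replaces an earlier one
    # Drop keys whose latest record is already older than the retention cutoff.
    return {k: r for k, r in latest.items() if r.timestamp_ms >= cutoff_ms}


DAY_MS = 86_400_000
now = 100 * DAY_MS
sample = [
    Record("a", "v1", now - 20 * DAY_MS),
    Record("b", "v1", now - 12 * DAY_MS),
    Record("a", "v2", now - 2 * DAY_MS),
]
view = expected_view(sample, now, 10 * DAY_MS)
```

With the sample data, only key "a" with its newest value should remain, since the only record for "b" is older than 10 days.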
For this, I have tried the following topic-specific config:
cleanup.policy=[compact,delete]
retention.ms=864000000 (10 days)
min.compaction.lag.ms=3600000 (1 hour)
min.cleanable.dirty.ratio=0.1
segment.ms=3600000 (1 hour)
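As a sanity check on the millisecond values above, the durations can be converted programmatically. This is only a sketch: the `to_ms` helper and the `topic_config` dictionary are illustrative names, and the dictionary is not tied to any particular Kafka client API.

```python
from datetime import timedelta


def to_ms(delta: timedelta) -> int:
    """Convert a timedelta to whole milliseconds, the unit used by Kafka's *.ms settings."""
    return delta // timedelta(milliseconds=1)


# Mirrors the topic-level settings listed above (cleanup.policy is written
# as [compact,delete] in the question).
topic_config = {
    "cleanup.policy": "compact,delete",
    "retention.ms": to_ms(timedelta(days=10)),
    "min.compaction.lag.ms": to_ms(timedelta(hours=1)),
    "min.cleanable.dirty.ratio": 0.1,
    "segment.ms": to_ms(timedelta(hours=1)),
}

for name, value in topic_config.items():
    print(f"{name}={value}")
```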
The broker configuration is as follows:
log.retention.hours=168 (7 days)
log.segment.bytes=1.1gb
log.cleanup.policy=delete
delete.retention.ms=86400000 (1 day)
When I set retention.ms to a smaller value in testing (e.g. 20 minutes or 1 hour), I can see that the data is correctly pruned after the retention period. The only setting I adjust is retention.ms on the topic.
With the 10-day setting, I can see that the data is being compacted as expected, but when I read the topic from the beginning after the retention period has passed, data much older than 10 days is still there. Is this a problem with such a long retention period?
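The "older than 10 days" check can be sketched as below, assuming the (offset, timestamp) pairs have already been collected by reading the topic from the beginning with some consumer. The `stale_records` function and its input shape are hypothetical and not part of any Kafka client library.

```python
from typing import Iterable, List, Tuple

RETENTION_MS = 864_000_000  # retention.ms from the topic config (10 days)


def stale_records(
    records: Iterable[Tuple[int, int]],
    now_ms: int,
    retention_ms: int = RETENTION_MS,
) -> List[Tuple[int, int]]:
    """Return the (offset, timestamp_ms) pairs older than now_ms - retention_ms."""
    cutoff_ms = now_ms - retention_ms
    return [(offset, ts) for offset, ts in records if ts < cutoff_ms]
```

If deletion behaved the way I expect, this would return an empty list for the topic. In my case it would not.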
Am I missing any configuration here? I have checked the Kafka logs and can see the broker rolling the segments and compacting as expected, but I can't see anything about deletes.
Kafka Version is 5.1.2-1