
If I understand correctly, the default value for cleanup.policy is delete, which means that old segments will be discarded when their retention time or size limit, as configured by log.retention.hours and log.retention.bytes, is reached.

Q:

Is it correct that in this case there can be several messages with the same key in the log with different values? What happens to messages with a null value (tombstones)? Is that also the default behavior?

If so, and if I understand correctly, this is where the compact cleanup policy comes in: it defines per-record retention, configured by delete.retention.ms. Does it both guarantee that the latest record for each key is kept and remove all messages with a null value?

Q:

  1. If so, what can I still see messages with null value even when delete.retention.ms is configured to be 1ms?

  2. Which configuration should I change in order to delete all messages in the log? Is that possible?

  3. Are retention.ms and delete.retention.ms connected in some way? If so, under which retention policy, and how?

Thanks!

OneCricketeer
Zvi Mints

1 Answer


which means that old segments will be discarded when their retention time or size limit

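Only closed segments, yes. A segment is closed once it reaches log.segment.bytes (1 GB by default) or is rolled by age; the active segment is never deleted. In other words, a 500 MB segment that is still being written to will remain on the broker for longer than retention.ms.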
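To make that concrete, here is a minimal sketch of the rule in plain Python. It is a toy model, not broker code: the `Segment` class and the `deletable_segments` helper are hypothetical names, and the sketch assumes a segment's age is judged by its newest record timestamp.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000


@dataclass
class Segment:
    size_bytes: int
    last_timestamp_ms: int  # newest record timestamp in the segment (assumption)
    active: bool            # True for the segment still being written to


def deletable_segments(segments, now_ms, retention_ms):
    """Return the closed segments whose newest record is older than retention_ms."""
    return [
        s for s in segments
        if not s.active and now_ms - s.last_timestamp_ms > retention_ms
    ]


now = 1000 * HOUR_MS
retention = 168 * HOUR_MS  # 7 days

segments = [
    # closed, full, old: eligible for deletion
    Segment(1_073_741_824, now - 200 * HOUR_MS, active=False),
    # closed but recent: kept
    Segment(1_073_741_824, now - 10 * HOUR_MS, active=False),
    # active 500 MB segment with old data: kept, because it is not closed
    Segment(500_000_000, now - 300 * HOUR_MS, active=True),
]

deletable = deletable_segments(segments, now, retention)
print(len(deletable))
```

Only the first segment is returned. The 500 MB active segment stays even though its data is older than the retention period.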

[there] can be several messages with the same key in the log with different values

Yes, there can be. All messages with the same key end up in the same partition, and messages with a null key are spread across different partitions (by default).
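A rough illustration of that partitioning behaviour, as a sketch only: `choose_partition` is a hypothetical helper, `zlib.crc32` stands in for the hash the real client uses, and plain round-robin stands in for however the producer spreads null-keyed messages.

```python
import itertools
import zlib

NUM_PARTITIONS = 3
_round_robin = itertools.count()


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Pick a partition: hash the key if present, otherwise rotate."""
    if key is None:
        # Null key: spread messages over the partitions (round-robin stand-in).
        return next(_round_robin) % num_partitions
    # Non-null key: the same bytes always hash to the same partition.
    return zlib.crc32(key) % num_partitions


same_key = {choose_partition(b"user-1") for _ in range(5)}
null_key = {choose_partition(None) for _ in range(NUM_PARTITIONS)}
print(len(same_key), len(null_key))
```

The keyed messages all land in one partition, so every value written for `user-1` sits in the same log until something removes the older ones.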


  1. I assume you meant to ask why you can still see them? Because the LogCleaner runs on a thread at intervals and only cleans closed segments. Messages aren't removed from log segments "right this millisecond", especially on clusters with TBs of data.

  2. If you want to completely delete a topic without waiting for the LogCleaner to do its work, use the appropriate API calls to delete the topic rather than using retention settings to purge it.

  3. delete.retention.ms is a setting for compacted topics (it controls how long tombstones are kept). retention.ms is for cleanup.policy=delete topics.
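The points above can be sketched as a toy model of compaction. This is an illustration under simplifying assumptions, not how the broker is implemented: `compact` is a hypothetical function, records are `(offset, key, value, timestamp_ms)` tuples, and a tombstone's age is measured from its record timestamp.

```python
def compact(closed_records, now_ms, delete_retention_ms):
    """Keep the latest record per key; drop tombstones older than delete_retention_ms.

    closed_records must be in offset order and come from closed segments only.
    """
    latest = {}
    for offset, key, value, ts in closed_records:
        latest[key] = (offset, key, value, ts)  # later offsets overwrite earlier ones

    survivors = []
    for offset, key, value, ts in sorted(latest.values()):
        is_tombstone = value is None
        if is_tombstone and now_ms - ts > delete_retention_ms:
            continue  # the tombstone has outlived delete.retention.ms
        survivors.append((offset, key, value, ts))
    return survivors


closed = [
    (0, "a", "v1", 0),
    (1, "b", "v1", 10),
    (2, "a", "v2", 20),
    (3, "b", None, 30),  # tombstone for key "b"
]
active = [(4, "a", "v3", 40)]  # the active segment is never cleaned

soon = compact(closed, now_ms=50, delete_retention_ms=100) + active
later = compact(closed, now_ms=1000, delete_retention_ms=100) + active
print(soon)
print(later)
```

In `soon` the tombstone for `b` is still visible because delete.retention.ms has not elapsed. In `later` it is gone, yet key `a` still appears twice, since the copy in the active segment has not been compacted. And none of this happens until the cleaner actually runs.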

OneCricketeer
  • Thanks, @OneCricketeer. Can you please edit your answer and state which configuration I should change in (2.) in order to purge the topic? Also, when is the LogCleaner scheduled? Regarding "Yes, there can be": can that happen under the compaction policy, since it should leave only the last one and also remove keys with a null value? Can you please elaborate on the process? – Zvi Mints Aug 25 '21 at 06:44
  • Read the linked post for any relevant configs. Compacted topics cannot have null keys. Compaction only happens on closed log segments, and is a function of the "dirty ratio" property. There's no time guarantee for when the log cleaner threads will run (sometimes they even crash without logging anything). You can add more threads with a broker property. – OneCricketeer Aug 25 '21 at 06:49