I'm paraphrasing here from the relevant section of the book Kafka: The Definitive Guide. It should clear up your doubt.
log.retention.bytes : This sets the total number of bytes of messages retained per partition. So, if we have a topic with 8 partitions and log.retention.bytes is set to 1 GB, the amount of data retained for the topic will be 8 GB at most. This means that if we ever increase the number of partitions for a topic, the total amount of data retained will also increase.
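To make the per-partition arithmetic concrete, here is a minimal sketch. The function name and the GB constant are mine, purely for illustration, and the result is only the rough upper bound described above, not an exact figure for disk usage.

```python
GB = 1_000_000_000  # decimal gigabyte, assumed here for illustration


def max_topic_retention_bytes(partitions: int, retention_bytes_per_partition: int) -> int:
    """Rough upper bound on data retained for a topic.

    log.retention.bytes applies per partition, so the bound scales
    linearly with the number of partitions.
    """
    return partitions * retention_bytes_per_partition


# 8 partitions at 1 GB each
print(max_topic_retention_bytes(8, GB) // GB)   # 8
# Growing the topic to 12 partitions raises the bound as well
print(max_topic_retention_bytes(12, GB) // GB)  # 12
```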
log.retention.ms : The most common configuration for how long Kafka will retain messages is by time. The default is specified in the configuration file using the log.retention.hours parameter, and it is set to 168 hours, or one week. However, two other parameters are allowed, log.retention.minutes and log.retention.ms. All three specify the same configuration, the amount of time after which messages may be deleted, but the recommended parameter to use is log.retention.ms, because the smaller unit size takes precedence if more than one is specified. This ensures that the value set for log.retention.ms is always the one used.
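The precedence rule can be sketched as a small function. This only illustrates the rule as described above and is not Kafka's actual code; the function and argument names are hypothetical.

```python
from typing import Optional


def effective_retention_ms(
    ms: Optional[int] = None,       # stands in for log.retention.ms
    minutes: Optional[int] = None,  # stands in for log.retention.minutes
    hours: int = 168,               # stands in for log.retention.hours (default: one week)
) -> int:
    """Return the retention time in milliseconds, smallest unit winning."""
    if ms is not None:
        return ms
    if minutes is not None:
        return minutes * 60 * 1000
    return hours * 60 * 60 * 1000


print(effective_retention_ms())                        # 604800000 (168 hours)
print(effective_retention_ms(minutes=30))              # 1800000
print(effective_retention_ms(ms=86400000, hours=168))  # 86400000, ms wins over hours
```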
Retention By Time and Last Modified Times : Retention by time is performed by examining the last modified time (mtime) on each log segment file on disk. Under normal cluster operations, this is the time that the log segment was closed, and represents the timestamp of the last message in the file. However, when using administrative tools to move partitions between brokers, this time is not accurate and will result in excess retention for these partitions.
Configuring Retention by Size and Time : If you have specified a value for both log.retention.bytes and log.retention.ms (or another parameter for retention by time), messages may be removed when either criterion is met. For example, if log.retention.ms is set to 86400000 (1 day) and log.retention.bytes is set to 1000000000 (1 GB), it is possible for messages that are less than 1 day old to get deleted if the total volume of messages over the course of the day is greater than 1 GB. Conversely, if the volume is less than 1 GB, messages can be deleted after 1 day even if the total size of the partition is less than 1 GB.
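Here is a toy model of the "either criterion" behaviour for a single partition. It is a simplified sketch, not Kafka's real cleanup logic: the Segment class, its fields, and the function name are hypothetical, and details such as the active segment never being deleted are ignored.

```python
from dataclasses import dataclass
from typing import List

HOUR_MS = 3_600_000
MB = 1_000_000


@dataclass
class Segment:
    age_ms: int      # time since the segment was last written
    size_bytes: int


def segments_to_delete(segments: List[Segment], retention_ms: int, retention_bytes: int) -> List[Segment]:
    """Walk segments oldest first and collect those that violate either limit."""
    deleted = []
    remaining = sum(s.size_bytes for s in segments)
    for seg in segments:
        too_old = seg.age_ms > retention_ms
        too_big = remaining > retention_bytes
        if not (too_old or too_big):
            break  # every later segment is younger and the size is within bounds
        deleted.append(seg)
        remaining -= seg.size_bytes
    return deleted


DAY_MS, ONE_GB = 86_400_000, 1_000_000_000

# Busy partition: 1.6 GB written within a day, so data younger than 1 day is removed.
busy = [Segment(h * HOUR_MS, 400 * MB) for h in (20, 15, 10, 5)]
print(len(segments_to_delete(busy, DAY_MS, ONE_GB)))   # 2

# Quiet partition: well under 1 GB, but the 30-hour-old segment is removed by time.
quiet = [Segment(30 * HOUR_MS, 100 * MB), Segment(2 * HOUR_MS, 100 * MB)]
print(len(segments_to_delete(quiet, DAY_MS, ONE_GB)))  # 1
```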