I can use KafkaTool and kafka-console-consumer to view data from the __consumer_offsets topic, but I can't figure out how to parse the data in Python if I read it directly with my own custom tool. Even when using KafkaTool, I can't decipher the key and value perfectly; there are odd characters that don't seem to follow any pattern. I think it has to do with the way Scala marshals the data into the raw bytes.
Here's the key format: [short: version] [string: group] [string: topic] [int32: partition], taken from https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala. This assumes version 0, which mine is.
Here's an example key in hex format: 00 01 00 16 63 6F 6E 73 6F 6C 65 2D 63 6F 6E 73 75 6D 65 72 2D 39 37 30 38 32 00 0D 73 74 61 67 69 6E 67 2D 73 70 65 6E 64 00 00 00 26
Now going through those bytes:
00 - version 0
01 00 - start-of-heading, null … okay, makes sense, but other messages begin with 02 00
16 63 6F 6E 73 6F 6C 65 2D 63 6F 6E 73 75 6D 65 72 2D 39 37 30 38 32 - looks like good data
00 0D - null, carriage return … okay, makes sense, but others have 00 0C
73 74 61 67 69 6E 67 2D 73 70 65 6E 64 - good data (“staging-spend”)
00 00 00 26 - I guess this is the end of the string plus the partition, in which case 00 00 denotes the end of the string?
I see similar inconsistencies in the message value. How exactly is the data formatted so I can parse it into string values?
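For reference, here is a minimal sketch of one way to read the example key, assuming the fields in the key format above are big-endian and that each string is stored as a two-byte length followed by that many UTF-8 bytes, with no terminator. The helper names `read_string` and `parse_offset_key` are made up for illustration and are not part of any Kafka client library.

```python
import struct


def read_string(buf, pos):
    """Read a string stored as a big-endian int16 length plus UTF-8 bytes.

    Returns the decoded string and the position just past it.
    """
    (length,) = struct.unpack_from(">h", buf, pos)
    start = pos + 2
    end = start + length
    return buf[start:end].decode("utf-8"), end


def parse_offset_key(buf):
    """Parse [short: version] [string: group] [string: topic] [int32: partition].

    Assumption: all integers are big-endian and strings are length-prefixed.
    """
    (version,) = struct.unpack_from(">h", buf, 0)
    group, pos = read_string(buf, 2)
    topic, pos = read_string(buf, pos)
    (partition,) = struct.unpack_from(">i", buf, pos)
    return version, group, topic, partition


# The example key from the question, copied byte for byte.
key = bytes.fromhex(
    "00 01 00 16 63 6F 6E 73 6F 6C 65 2D 63 6F 6E 73 75 6D 65 72 2D "
    "39 37 30 38 32 00 0D 73 74 61 67 69 6E 67 2D 73 70 65 6E 64 00 00 00 26"
)

print(parse_offset_key(key))
# → (1, 'console-consumer-97082', 'staging-spend', 38)
```

Under this reading, the first two bytes 00 01 form the short version field (so this key is version 1, not 0), 00 16 is the group length (22), 00 0D is the topic length (13), and 00 00 00 26 is the partition (38). A 00 0C in another key would then simply be a 12-byte string length. Keys that start with a different version may use a different layout, which this sketch does not handle.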