
I need to specify kafka_cluster_ids. While testing strings with the fromString method of the org.apache.kafka.common.Uuid class, I noticed that the last character of the resulting Uuid differs from the input string. However, if I pass the string returned by Uuid.toString to fromString to create another Uuid, the input and output are the same.

import org.apache.commons.text.CharacterPredicates;
import org.apache.commons.text.RandomStringGenerator;
import org.apache.kafka.common.Uuid;

public class Main {

    public static void main(String[] args) {
        RandomStringGenerator rsg = new RandomStringGenerator.Builder()
                .withinRange('0', 'z')
                .filteredBy(CharacterPredicates.LETTERS, CharacterPredicates.DIGITS)
                .build();
        String unique = rsg.generate(22);
        System.out.println(unique);  // prints MHAQTegTj687qceJKFx0gB
        Uuid id = Uuid.fromString(unique);
        System.out.println(id); // prints MHAQTegTj687qceJKFx0gA

        Uuid id2 = Uuid.fromString(id.toString());
        System.out.println(id2); // prints MHAQTegTj687qceJKFx0gA
    }
}

It's a minor thing, but it bugs me a bit. I'd like to know why that last character is changing. Thanks in advance!

Edit: For reference, the relevant source code from Uuid:

    /**
     * Returns a base64 string encoding of the UUID.
     */
    @Override
    public String toString() {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(getBytesFromUuid());
    }

    /**
     * Creates a UUID based on a base64 string encoding used in the toString() method.
     */
    public static Uuid fromString(String str) {
        if (str.length() > 24) {
            throw new IllegalArgumentException("Input string with prefix `"
                    + str.substring(0, 24) + "` is too long to be decoded as a base64 UUID");
        }

        ByteBuffer uuidBytes = ByteBuffer.wrap(Base64.getUrlDecoder().decode(str));
        if (uuidBytes.remaining() != 16) {
            throw new IllegalArgumentException("Input string `" + str + "` decoded as "
                    + uuidBytes.remaining() + " bytes, which is not equal to the expected 16 bytes "
                    + "of a base64-encoded UUID");
        }

        return new Uuid(uuidBytes.getLong(), uuidBytes.getLong());
    }

    private byte[] getBytesFromUuid() {
        // Extract bytes for uuid which is 128 bits (or 16 bytes) long.
        ByteBuffer uuidBytes = ByteBuffer.wrap(new byte[16]);
        uuidBytes.putLong(this.mostSignificantBits);
        uuidBytes.putLong(this.leastSignificantBits);
        return uuidBytes.array();
    }

quantumferret

1 Answer

Apache kafka-client Uuid input differs from output (usage of toString() or fromString() methods)

In short, these are different UUID formats, and which one you get depends on the version of the artifact you use. In Kafka prior to 2.6.x, the toString() method of the UUID class returned a string in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, where each x represents a hexadecimal digit. This is the canonical textual form of a UUID and, in my experience, the most common one. However, from Kafka 2.6.x, the toString() method of the UUID class returns a string in the format xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, where each x represents a hexadecimal digit, that is, the same 32 digits without hyphens. For how to convert between the two, see here.
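Separately from the format question, the changed last character in the question's output can be reproduced with java.util.Base64 alone, which is what the quoted fromString and toString rely on. A 22-character base64 string carries 132 bits, but a UUID holds only 128, so the low 4 bits of the last character are dropped on decoding and come back as zeros on encoding. 'B' (value 1) and 'A' (value 0) differ only in those dropped bits. The following is a minimal sketch under that reading; the class name Base64TailDemo and the helper roundTrip are made up for illustration and are not part of Kafka.

```java
import java.util.Base64;

final class Base64TailDemo {

    // Decode and re-encode the same way the quoted Uuid.fromString / Uuid.toString do.
    static String roundTrip(String s) {
        byte[] bytes = Base64.getUrlDecoder().decode(s);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        String input = "MHAQTegTj687qceJKFx0gB"; // the string from the question

        // 22 characters decode to exactly 16 bytes (128 bits); the extra 4 bits are discarded.
        System.out.println(Base64.getUrlDecoder().decode(input).length);

        // Re-encoding fills the discarded bits with zeros, so the trailing 'B' becomes 'A'.
        System.out.println(roundTrip(input));

        // A string whose last character already has zero low bits survives unchanged.
        System.out.println(roundTrip("MHAQTegTj687qceJKFx0gA"));
    }
}
```

If this reading is right, only a last character whose low 4 bits are zero ('A', 'Q', 'g' or 'w') round-trips unchanged, so a randomly generated 22-character string will usually come back with a different final character.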

Lunatic