
I've got my Kafka Streams processing configuration for AUTO_REGISTER_SCHEMAS set to true.

I noticed that the auto-generated schema declares what looks like 2 types for a field:

{
  "name": "id",
  "type": {
    "type": "string",
    "avro.java.string": "String"
  }
},

Could someone please explain why it creates 2 types, and what exactly "avro.java.string": "String" is?

Thanks

OneCricketeer
userMod2
  • There's only one type. It's a string, but it's clarified to a specific subclass – OneCricketeer Apr 23 '18 at 14:11
  • Is this going to be a problem when a Python client tries to work with this schema? – Ryan Apr 28 '20 at 15:44
  • The behavior of the class generator which modifies schemas by replacing AVRO strings with Java specific logical types is a bug: https://issues.apache.org/jira/browse/AVRO-2838 as it does indeed become a problem with interoperability with Python and other languages – Ryan May 24 '21 at 16:02
  • Confluent has added a property that can be set on the serializer to prevent this from occurring: avro.remove.java.properties – Andrew Kirk Mar 06 '23 at 17:19

1 Answer


There is only one Avro type here: string. By default, Avro's Java binding represents string values as CharSequence (at runtime, typically its own Utf8 class). The "avro.java.string" property is a Java-specific hint that overrides this default, so fields declared as follows are read as java.lang.String instead:

"type": {
  "type": "string",
  "avro.java.string": "String"
}
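
For non-Java clients, the Avro specification treats attributes it does not define as metadata that must not affect the serialized data, so the property can be ignored or stripped from a copy of the schema. Below is a minimal sketch using only Python's standard library. The helper name strip_java_string is hypothetical and not part of any Avro library, and the sketch assumes the schema has already been fetched as JSON text.

```python
import json

JAVA_STRING_PROP = "avro.java.string"
# Keys whose values are themselves schemas (or lists of fields/schemas).
SCHEMA_KEYS = ("type", "fields", "items", "values")


def strip_java_string(node):
    """Return a copy of a parsed Avro schema without the Java-only string hint.

    Hypothetical helper for illustration; it does not mutate its input.
    """
    if isinstance(node, list):
        # A union (list of schemas) or a list of record fields.
        return [strip_java_string(n) for n in node]
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            if key == JAVA_STRING_PROP:
                continue  # drop the Java-specific hint
            # Recurse only into schema-bearing keys, leaving doc/default/etc. alone.
            cleaned[key] = strip_java_string(value) if key in SCHEMA_KEYS else value
        # {"type": "string"} with no other attributes is equivalent to "string".
        if cleaned == {"type": "string"}:
            return "string"
        return cleaned
    return node


field = json.loads(
    '{"name": "id", "type": {"type": "string", "avro.java.string": "String"}}'
)
print(json.dumps(strip_java_string(field)))
# prints {"name": "id", "type": "string"}
```

The result is the same Avro type with the Java hint removed, which is what the one-type reading of the schema predicts.
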
Etienne Neveu
hlagos