I am using kafka-python 2.0.1 to consume Avro data. This is the code I have tried:

import io

import avro.schema
from avro.io import DatumReader, BinaryDecoder
from kafka import KafkaConsumer

# Load the reader schema from disk.
schema_path = "schema.avsc"
with open(schema_path) as schema_file:
    schema = avro.schema.parse(schema_file.read())
reader = DatumReader(schema)

consumer = KafkaConsumer(
    bootstrap_servers='xxx.xxx.xxx.xxx:9093',
    security_protocol='SASL_SSL',
    sasl_mechanism='GSSAPI',
    auto_offset_reset='latest',
    ssl_check_hostname=False,
    api_version=(1, 0, 0))

consumer.subscribe(['test'])

for message in consumer:
    message_val = message.value
    print(message_val)
    bytes_reader = io.BytesIO(message_val)
    bytes_reader.seek(5)  # skip the first 5 bytes, as suggested in the linked thread
    decoder = BinaryDecoder(bytes_reader)
    record = reader.read(decoder)
    print(record)

I am getting the following error:

avro.io.SchemaResolutionException: Can't access branch index 55 for union with 2 branches
Writer's Schema: [ "null", "int" ]
Reader's Schema: [ "null", "int" ]

Can anyone suggest a possible cause of this error? I have already followed this thread and skip the initial 5 bytes:

How to decode/deserialize Avro with Python from Kafka

James Z
sgmbd
    Can anyone help with this issue? I don't understand what this error actually means: avro.io.SchemaResolutionException: Can't access branch index 55 for union with 2 branches – sgmbd Jul 12 '20 at 16:47

2 Answers

I got it working. The issue was that the wrong schema was being referenced. Thanks.
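One way to check which schema a message was written with is to inspect the 5 bytes that the question's code skips. The sketch below is not from the original answer. It assumes the producer used the Confluent Schema Registry wire format (a magic byte 0 followed by a 4-byte big-endian schema ID); `split_confluent_header` and the example payload are hypothetical names and data used only for illustration.

```python
import struct


def split_confluent_header(payload: bytes):
    """Return (schema_id, avro_body) for a Confluent-framed message.

    Assumption: byte 0 is the magic byte 0 and bytes 1..4 hold the
    schema ID as a big-endian unsigned 32-bit integer.
    """
    if len(payload) < 5 or payload[0] != 0:
        raise ValueError("payload does not start with the magic byte 0")
    (schema_id,) = struct.unpack(">I", payload[1:5])
    return schema_id, payload[5:]


# Hypothetical payload: magic byte, schema ID 42, then two bytes of Avro data.
example = b"\x00\x00\x00\x00\x2a" + b"\x02\x06"
schema_id, body = split_confluent_header(example)
print(schema_id, body)
```

If the ID printed for a real message does not correspond to the schema in `schema.avsc`, the reader is decoding with a schema the writer did not use.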

sgmbd

I just ran into this issue too (I use Schema Registry, but the underlying problem is the same), and here is my take on it.
After some debugging, I found that in my case the error was raised when the producer had created the message with an older, different version of the schema. This seems to be the case for the OP too: the produced message and the consumer were probably using different schemas.
Basically, the consumer receives the message and also fetches the latest schema. If the two do not match, it raises this error.
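To see why a mismatch can surface as an impossible branch index, it helps to know that Avro writes a union as a zigzag-encoded variable-length integer (the branch index) followed by the value. The sketch below is only an illustration, not taken from the thread: `read_zigzag_long` is a hypothetical helper, and the byte strings are made-up inputs. When the decoder is positioned on bytes that were written for a different field, it interprets whatever is there as the index.

```python
def read_zigzag_long(data: bytes, pos: int = 0):
    """Decode one Avro variable-length zigzag integer; return (value, next_pos)."""
    shift = 0
    raw = 0
    while True:
        byte = data[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift   # low 7 bits carry data
        if not byte & 0x80:             # high bit clear means last byte
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos  # undo the zigzag mapping


# Correctly aligned [ "null", "int" ] value: first byte decodes to branch 1.
print(read_zigzag_long(b"\x02\x06"))  # (1, 1)

# Misaligned read: the ASCII byte "n" (0x6E) decodes to 55.
print(read_zigzag_long(b"name"))      # (55, 1)
```

An index such as 55 for a two-branch union is therefore a sign that the reader and the bytes disagree about the record layout, not that the union itself is wrong.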
In my case, I simply handled the exception, because I knew those messages were old and irrelevant.
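A minimal sketch of that kind of handling, under stated assumptions: `decode_or_skip` and `fake_decode` are hypothetical names, and the stand-in decoder raises `ValueError` so the example runs without Kafka or Avro installed. In the question's consumer loop, `decode` would wrap the `reader.read(...)` call and `skip_on` would be the `avro.io.SchemaResolutionException` shown in the traceback.

```python
import logging


def decode_or_skip(payloads, decode, skip_on):
    """Yield decoded records; log and skip payloads whose decoding raises skip_on."""
    for payload in payloads:
        try:
            yield decode(payload)
        except skip_on as exc:
            logging.warning("skipping undecodable message: %s", exc)


def fake_decode(payload: bytes) -> str:
    """Stand-in decoder: treats payloads starting with b'old' as a schema mismatch."""
    if payload.startswith(b"old"):
        raise ValueError("schema mismatch")
    return payload.decode()


records = list(decode_or_skip([b"old-1", b"new-1", b"new-2"], fake_decode, ValueError))
print(records)  # ['new-1', 'new-2']
```

Skipping is only safe when the failing messages are known to be disposable; otherwise decoding them with the schema they were written with is the more careful option.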

v100ev