
I have a dockerized Spark app for simple streaming. The listener generates random numbers and sends them to Kafka using this code:

import json
import random

from kafka import KafkaProducer  # kafka-python

producer = KafkaProducer(bootstrap_servers=kafka_brokers, api_version=(0, 10, 1))
while True:
    # generate a JSON payload with a single random number
    data = {"field": random.random()}
    producer.send(topic_name, json.dumps(data).encode())

Then I try to read this data with a consumer, like so:

import json

from kafka import KafkaConsumer  # kafka-python

consumer = KafkaConsumer(topic_name, bootstrap_servers=['192.168.99.100:9092'])
numbers = []
for message in consumer:
    record = json.loads(message.value)
    numbers.append(record['field'])  # 'list' is a builtin, so append to a named list

When I run the code it never gets past the 'for message in consumer' line. I checked within Kafka and the messages are all there, but I cannot access them through Python.
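For reference, this is the consumer pattern I would expect to work with kafka-python (the topic name and broker address are carried over from above; `auto_offset_reset` and `consumer_timeout_ms` are my additions, since a fresh consumer defaults to reading only messages that arrive after it starts and otherwise blocks forever on the iterator):

```python
import json


def extract_field(raw_value):
    # Decode one Kafka message payload (bytes of JSON) and pull out 'field'.
    return json.loads(raw_value)["field"]


if __name__ == "__main__":
    # kafka-python import kept here so the helper stays importable without Kafka.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "topic_name",                               # topic used by the producer
        bootstrap_servers=["192.168.99.100:9092"],
        auto_offset_reset="earliest",               # also read messages produced before startup
        consumer_timeout_ms=10000,                  # stop iterating after 10 s of inactivity
    )
    numbers = [extract_field(message.value) for message in consumer]
    print(numbers)
```

With the default `auto_offset_reset="latest"`, messages sent before the consumer connected are invisible to it, which would match the symptom of the loop never yielding anything.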

Edit: I am using the bitnami Spark containers and these settings for Kafka and Zookeeper.

I just have two separate files, one for the producer and one for the consumer. I run the producer file, which sends to Kafka, then I spark-submit the consumer file, which should just print a list of the numbers received. For this I simply do spark-submit --master spark://spark:7077 consumer.py
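As written, the consumer file uses kafka-python directly, so spark-submit just runs it as an ordinary Python script. If the intent is to consume through Spark itself, a minimal Structured Streaming sketch might look like this (assuming Spark 3.x with the spark-sql-kafka package on the classpath and the broker address from above; the helper function and topic name are illustrative):

```python
import json


def parse_payload(raw_value):
    # Pure helper mirroring the cast/parse done on the Spark side; the
    # {"field": ...} payload shape is assumed from the consumer code above.
    return json.loads(raw_value)["field"]


if __name__ == "__main__":
    # pyspark import kept here so the helper stays importable without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-consumer").getOrCreate()

    # Requires the Kafka source on the classpath, e.g.
    # spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0 consumer.py
    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "192.168.99.100:9092")
          .option("subscribe", "topic_name")       # assumed topic name
          .option("startingOffsets", "earliest")   # also read pre-existing messages
          .load())

    # Kafka values arrive as binary; cast to string and print to the console sink.
    query = (df.selectExpr("CAST(value AS STRING) AS value")
             .writeStream
             .format("console")
             .start())
    query.awaitTermination()
```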

0 Answers