I'm having trouble authenticating to a Kafka topic over SSL from Spark Streaming.
I have three SSL files in PEM format for authenticating to the Kafka topic:
- ssl_cafile
- ssl_certfile
- ssl_keyfile
In kafka-python I use them like this:
from kafka import KafkaProducer

# hostname and security_protocol are defined elsewhere in my script
producer = KafkaProducer(bootstrap_servers=hostname,
                         security_protocol=security_protocol,
                         ssl_check_hostname=False,
                         ssl_cafile='./certs_kafka/CARootChain.pem',
                         ssl_certfile='./certs_kafka/certificate.pem',
                         ssl_keyfile='./certs_kafka/key.pem')
How can I use them in Spark Streaming to authenticate to my topic?
While searching for an answer, I found this example of SSL authentication:
df = spark \
    .readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "server:port") \
    .option("subscribe", "z-main-like-test-topic") \
    .option("startingOffsets", "earliest") \
    .option("failOnDataLoss", "false") \
    .option("kafka.security.protocol", "SSL") \
    .option("kafka.ssl.enabled.protocols", "TLSv1.2") \
    .option("kafka.ssl.truststore.location", truststore) \
    .option("kafka.ssl.truststore.type", "JKS") \
    .option("kafka.ssl.truststore.password", truststore_password) \
    .option("kafka.ssl.keystore.location", keystore) \
    .option("kafka.ssl.keystore.type", "JKS") \
    .option("kafka.ssl.keystore.password", keystore_password) \
    .load()
But I don't understand what should go in the keystore and what in the truststore. Has anyone set up SSL authentication in Spark Streaming who can explain which PEM files to combine to build the JKS keystore and truststore?
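My working assumption, which I have not verified, is that the truststore should hold the CA chain (ssl_cafile) and the keystore should hold the client certificate together with its private key (ssl_certfile plus ssl_keyfile). Under that assumption, the options I expect to pass look roughly like the sketch below. kafka_ssl_options is a hypothetical helper name of my own, and the .jks paths and passwords are placeholders for files I have not created yet:

```python
def kafka_ssl_options(truststore, truststore_password, keystore, keystore_password):
    """Collect the Kafka SSL options from the example above into one dict.

    Assumed mapping (not confirmed):
      truststore <- CARootChain.pem (the CA chain, ssl_cafile)
      keystore   <- certificate.pem + key.pem (ssl_certfile + ssl_keyfile)
    """
    return {
        "kafka.security.protocol": "SSL",
        "kafka.ssl.enabled.protocols": "TLSv1.2",
        "kafka.ssl.truststore.location": truststore,
        "kafka.ssl.truststore.type": "JKS",
        "kafka.ssl.truststore.password": truststore_password,
        "kafka.ssl.keystore.location": keystore,
        "kafka.ssl.keystore.type": "JKS",
        "kafka.ssl.keystore.password": keystore_password,
    }


# Placeholder paths and passwords, for illustration only.
ssl_options = kafka_ssl_options(
    "./certs_kafka/truststore.jks", "changeit",
    "./certs_kafka/keystore.jks", "changeit",
)

# Intended use with the reader from the example above:
#   spark.readStream.format("kafka").options(**ssl_options)...
```

If this mapping is wrong, I'd appreciate a correction on which PEM belongs in which store.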