I am looking for a Kafka Connect connector that writes from Kafka to the local file system in Parquet format. I don't want to use the HDFS or S3 sink connector for this.

Aman Jain

1 Answer

`format.class=ParquetFormat` only exists in the connectors you mentioned (the HDFS and S3 sinks).

With the HDFS connector, you can use a `file://` prefix to write to local disk. Alternatively, use a project like MinIO to run a self-hosted, S3-compatible store and point the S3 connector at it.
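
As a rough, untested sketch of the `file://` approach, the snippet below builds a JSON payload for the HDFS sink connector. The connector name, topic, and output directory are hypothetical placeholders, and the property names are from memory of the Confluent HDFS sink connector (older versions use `hdfs.url` instead of `store.url`), so check them against the docs for your version.

```python
import json


def local_parquet_sink_config(topic: str, out_dir: str) -> dict:
    """Build a Kafka Connect payload for an HDFS sink that writes Parquet locally."""
    return {
        # Hypothetical connector name; pick your own.
        "name": "local-parquet-sink",
        "config": {
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "topics": topic,
            # Assumption: a file:// URL is accepted in place of an hdfs:// namenode address.
            "store.url": f"file://{out_dir}",
            "format.class": "io.confluent.connect.hdfs.parquet.ParquetFormat",
            # Number of records written before a file is committed.
            "flush.size": "1000",
            "tasks.max": "1",
        },
    }


# Placeholder topic and directory, for illustration only.
payload = local_parquet_sink_config("my-topic", "/tmp/kafka-parquet")
print(json.dumps(payload, indent=2))
```

You would then POST the printed JSON to the Connect REST API's `/connectors` endpoint (port 8083 by default). As far as I know, `ParquetFormat` needs records that carry a schema, so a schema-aware converter such as Avro is likely required as well.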

OneCricketeer
  • Will this work without specifying HDFS server details if I use the HDFS connector, or without S3 host details if I use the S3 connector? – Aman Jain Apr 07 '22 at 13:48
  • Rather than a namenode, you can set `store.url` to a `file://` path; I don't think you need any other configs like Hadoop XML files (it's been a while since I tried it). For the S3 connector, you do need those details. Like I said, you can use MinIO, for example: https://blog.min.io/kafka_and_minio/ – OneCricketeer Apr 07 '22 at 17:07
  • I used MinIO and was able to write data to it using the kafka-connect-s3 connector. – Aman Jain Apr 20 '22 at 17:39
  • I have asked another question related to HDFS; see if you can help me there as well: https://stackoverflow.com/questions/71943956/not-able-to-connect-to-hadoop-server-using-hadoopfilesystem-in-pyarrrow – Aman Jain Apr 20 '22 at 18:09