I am looking for a Kafka Connect connector that writes from Kafka to the local file system in Parquet format. I don't want to use the HDFS or S3 sink connectors for this.
1 Answer
`format.class=ParquetFormat` only exists in the connectors you mentioned.
You can use a `file://` prefix to write to local disk, or use a project like MinIO to reproduce a self-hosted S3.
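
As a rough, untested sketch of the first option, an HDFS sink connector config pointed at a local directory might look like the following. The connector name, topic, and `/tmp/kafka-connect` path are placeholders I made up, and this assumes the records carry a schema (for example, via the Avro converter), which Parquet output needs.

```properties
# Hypothetical name, topic, and path; adjust for your setup.
name=local-parquet-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=my_topic
# A file:// URL instead of an hdfs://namenode:port address.
store.url=file:///tmp/kafka-connect
format.class=io.confluent.connect.hdfs.parquet.ParquetFormat
# Number of records written per file before it is committed.
flush.size=1000
```

For the MinIO route, a similar sketch with the S3 sink connector, assuming MinIO listens on `localhost:9000` and a bucket named `my-bucket` already exists (both are assumptions). Credentials would typically come from the usual AWS environment variables set to your MinIO access and secret keys.

```properties
# Hypothetical name, topic, bucket, and endpoint; adjust for your setup.
name=minio-parquet-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=my_topic
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.parquet.ParquetFormat
s3.bucket.name=my-bucket
s3.region=us-east-1
# Points the S3 client at MinIO instead of AWS.
store.url=http://localhost:9000
flush.size=1000
```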

OneCricketeer
-
Will this work without specifying HDFS server details if I use the HDFS connector, or without S3 host details if I use the S3 connector? – Aman Jain Apr 07 '22 at 13:48
-
Rather than a namenode address, you can set `store.url` to a `file://` path; I don't think you need any other configs like Hadoop XML files (it's been a while since I tried it). For the S3 connector, you do need those details. Like I said, you can use MinIO, for example: https://blog.min.io/kafka_and_minio/ – OneCricketeer Apr 07 '22 at 17:07
-
So I have used MinIO, and I am able to write data to it using the kafka-connect-s3 connector. – Aman Jain Apr 20 '22 at 17:39
-
I have asked another question related to HDFS; see if you can help me there as well: https://stackoverflow.com/questions/71943956/not-able-to-connect-to-hadoop-server-using-hadoopfilesystem-in-pyarrrow – Aman Jain Apr 20 '22 at 18:09