Questions tagged [apache-kafka-connect]

Apache Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems.

It was first released with Kafka 0.9. It lets you import data from external systems (e.g., databases) into Kafka and export data from Kafka into external systems (e.g., Hadoop). Apache Kafka Connect is a framework with a plug-in mechanism that lets you provide custom connectors for your system of choice.
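
As a rough illustration of how a connector is described to the framework, the sketch below builds the JSON body that a Connect worker's REST API accepts on `POST /connectors`. It assumes the FileStreamSourceConnector example class that ships with Apache Kafka; the helper name `file_source_connector`, the connector name, the file path, and the topic are all hypothetical values chosen for the example.

```python
import json


def file_source_connector(name, path, topic, tasks_max=1):
    """Build a request body for POST /connectors (hypothetical helper)."""
    return {
        "name": name,
        "config": {
            # Example connector class bundled with Apache Kafka.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            # Connector config values are passed as strings.
            "tasks.max": str(tasks_max),
            # File to read lines from (assumed path).
            "file": path,
            # Kafka topic that receives each line (assumed name).
            "topic": topic,
        },
    }


payload = file_source_connector("local-file-source", "/tmp/input.txt", "connect-test")
print(json.dumps(payload, indent=2))
```

In distributed mode this body would typically be sent to a worker's REST endpoint (port 8083 by default); a custom connector plugs in by swapping `connector.class` and its own settings.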

Documentation

3693 questions
58
votes
5 answers

Kafka Connect running out of heap space

After starting Kafka Connect (connect-standalone), my task fails immediately after starting with: java.lang.OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) at…
Robin Daugherty
  • 7,115
  • 4
  • 45
  • 59
48
votes
3 answers

Kafka Connect JDBC vs Debezium CDC

What are the differences between the JDBC Connector and the Debezium SQL Server CDC Connector (or any other relational database connector), and when should I choose one over the other? I am searching for a solution to sync between two relational databases. Not…
Ofek Hod
  • 3,544
  • 2
  • 15
  • 26
31
votes
2 answers

Ideal value for Kafka Connect Distributed tasks.max configuration setting?

I am looking to productionize and deploy my Kafka Connect application. However, there are two questions I have about the tasks.max setting which is required and of high importance but details are vague for what to actually set this value to. If I…
29
votes
4 answers

Kafka : Use common consumer group to access multiple topics

Our cluster runs Kafka 0.11 and has strict restrictions on using consumer groups. We cannot use arbitrary consumer groups so Admin has to create required consumer groups. We run Kafka Connect HDFS Sinks to read data from topics and write to HDFS.…
Ashika Umanga Umagiliya
  • 8,988
  • 28
  • 102
  • 185
26
votes
1 answer

Kafka connect: The configuration XXX was supplied but isn't a known config in AdminClientConfig

When starting Kafka-Connect, I saw lots of warnings 10:33:56.706 [DistributedHerder] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'config.storage.topic' was supplied but isn't a known config. 10:33:56.707…
Holm
  • 2,987
  • 3
  • 27
  • 48
26
votes
1 answer

How to run a Kafka connect worker in YARN?

I'm playing with Kafka-Connect. I've got the HDFS connector working both in stand-alone mode and distributed mode. They advertise that the workers (which are responsible for running the connectors) can be managed via YARN. However, I haven't seen…
hba
  • 7,406
  • 10
  • 63
  • 105
24
votes
3 answers

Make Kafka Topic Log Retention Permanent

I am writing log messages into a Kafka Topic and I want the retention of this topic to be permanent. I have seen in Kafka and Kafka Connect (_schemas, connect-configs, connect-status, connect-offsets, etc) that there are special topics that are not…
user1077071
  • 901
  • 6
  • 16
  • 29
23
votes
2 answers

Kafka Connect S3 Connector OutOfMemory errors with TimeBasedPartitioner

I'm currently working with the Kafka Connect S3 Sink Connector 3.3.1 to copy Kafka messages over to S3 and I have OutOfMemory errors when processing late data. I know it looks like a long question, but I tried my best to make it clear and simple to…
raphael
  • 623
  • 5
  • 9
21
votes
4 answers

Kafka Connect - Failed to flush, timed out while waiting for producer to flush outstanding messages

I am trying to use the Kafka Connect JDBC Source Connector with the following properties in BULK…
David
  • 251
  • 1
  • 2
  • 5
20
votes
2 answers

Kafka Connect - How to delete a connector

I created a cassandra-sink connector; after that I made some changes in the connector.properties file. After stopping the worker and starting it again, now when I add the connector using: java -jar kafka-connect-cli-1.0.6-all.jar create…
el323
  • 2,760
  • 10
  • 45
  • 80
19
votes
3 answers

"The $changeStream stage is only supported on replica sets" error while using mongodb-source-connect

I get an error when running kafka-mongodb-source-connect. I was trying to run connect-standalone with connect-avro-standalone.properties and MongoSourceConnector.properties so that Connect writes data that is written to MongoDB to a Kafka topic. This…
Jaeho Lee
  • 463
  • 1
  • 5
  • 14
18
votes
4 answers

Restart kafka connect sink and source connectors to read from beginning

I have searched quite a lot on this but there doesn't seem to be a good guide around this. From what I have searched there are a few things to consider: Resetting Sink Connector internal topics (status, config and offset). Source Connector offsets…
el323
  • 2,760
  • 10
  • 45
  • 80
17
votes
1 answer

Kafka-connect, Bootstrap broker disconnected

I'm trying to set up Kafka Connect with the intent of running an ElasticsearchSinkConnector. The Kafka setup consists of 3 brokers secured using Kerberos, SSL and ACL. So far I've been experimenting with running the connect-framework and the…
Jiinxy
  • 559
  • 2
  • 5
  • 17
17
votes
3 answers

How to handle backpressure in a Kafka Connect Sink?

We build a custom Kafka Connect sink which in turn calls a remote REST API. How do I propagate backpressure to the Kafka Connect infrastructure, so put() is called less often in cases when the remote system is slower than the internal consumer…
Chris W.
  • 2,266
  • 20
  • 40
16
votes
7 answers

Kafka Connect Distributed mode The group coordinator is not available

I have been trying this for two weeks now. I am running a Kafka cluster on separate machines from my Connect nodes. I am unable to get Connect running properly. I can read and write to Kafka with no issue. Zookeeper seems to be running fine. I launch…
ldrrp
  • 666
  • 1
  • 7
  • 24