
I am running a development environment for Confluent Kafka, Community edition, version 3.0.1-2.11, on Windows. I am trying to achieve load balancing of tasks between 2 instances of a connector. I am running Kafka ZooKeeper, the Kafka server, REST services, and 2 instances of Connect distributed on the same machine. The only difference between the properties files for the two Connect instances is the REST port, since they run on the same machine. I don't create the topics for connector offsets, config, and status. Should I? I have custom code for a sink connector.

When I create the worker for my sink connector, I do so by executing a POST request

POST http://localhost:8083/connectors

against either of the running Connect instances. Checking whether the worker is loaded is done at the URL

GET http://localhost:8083/connectors
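
For reference, a connector is created by POSTing a JSON config to that endpoint; a minimal example (the name, class, and topic here are placeholders, not the actual config from this setup):

```json
{
  "name": "my-sink-connector",
  "config": {
    "connector.class": "com.example.MySinkConnector",
    "topics": "my-topic",
    "tasks.max": "2"
  }
}
```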

My sink connector has System.out.println() lines in the code, so I can follow its output in the console log. When my worker is running I can see that only one instance of the connector is executing the code. If I terminate that instance, the other one takes over the worker and execution resumes. However, this is not what I want. My goal is for both connector instances to run the worker code so that they can share the load between them. I've gone over some open source connectors to see whether there are specifics to how connector code should be written, but with no success.

I've made several different attempts to tackle this problem, but with no success. I could rewrite my business code to work around it, but I'm pretty sure I'm missing something that isn't obvious to me. Recently I commented on Robin Moffatt's answer to this question.

OneCricketeer
Miki

1 Answer


From the sounds of it, your custom code is not correctly spawning the number of tasks that you are expecting.

  • Make sure that you've set tasks.max > 1 in your connector config
  • Make sure that your connector's taskConfigs() method correctly returns the appropriate number of task configurations
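
The usual mistake behind the second point is a taskConfigs() implementation that always returns a single config, regardless of maxTasks. A minimal sketch of the splitting logic, written here as plain Java with no Kafka dependency (in a real connector this would be the body of Connector.taskConfigs(int maxTasks), and "work.units" is a made-up property name for whatever units of work your connector divides up):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskConfigSketch {

    // Divide the connector's units of work (topics, tables, files, ...)
    // into up to maxTasks groups; each returned map becomes the
    // configuration handed to one task instance.
    static List<Map<String, String>> taskConfigs(List<String> workUnits, int maxTasks) {
        int numGroups = Math.min(workUnits.size(), maxTasks);
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < numGroups; i++) {
            Map<String, String> cfg = new HashMap<>();
            cfg.put("work.units", "");
            configs.add(cfg);
        }
        // Round-robin assignment of work units to task configs
        for (int i = 0; i < workUnits.size(); i++) {
            Map<String, String> cfg = configs.get(i % numGroups);
            String existing = cfg.get("work.units");
            cfg.put("work.units",
                    existing.isEmpty() ? workUnits.get(i) : existing + "," + workUnits.get(i));
        }
        return configs;
    }

    public static void main(String[] args) {
        List<String> units = List.of("topic-a", "topic-b", "topic-c");
        // With tasks.max=2, two task configs are produced, so the
        // Connect cluster can schedule one task on each worker.
        System.out.println(taskConfigs(units, 2).size()); // prints 2
    }
}
```

If this method returns a one-element list, Connect will only ever start one task, and the second worker has nothing to run, which matches the behaviour described in the question.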

References:

Robin Moffatt
  • tasks.max is always set to 10, just to be sure ;) Your second bullet point does seem to be the problem. I'll take some time to check through the links you've pointed out to see what I'm doing wrong – Miki Sep 10 '19 at 12:45
  • Well, it turns out the badly coded taskConfigs method you pointed out was the troublemaker. Basically only one task was being generated. Rufus Nash's blog post explaining the Venafi connector, which also ran only one task, was more than helpful. Thanks a lot! – Miki Sep 10 '19 at 14:29
  • I am running Apache Kafka on my Windows machine with two Kafka Connect workers (ports 8083, 8084) and three partitions (replication factor of one). I am able to see the failover to the other Kafka Connect worker whenever I shut one down, but load balancing is not happening because the number of tasks is always ONE. I am using the official MongoDB Kafka connector (change stream) with tasks.max=6. Even under a higher volume of data, the task count remains one. What am I missing here? How do I know only one task is running? /connectors/mongodb-connector/status shows a single task in the tasks array. – Hamid Jul 06 '20 at 16:24
  • Hi Robin, I have started a new question, link below: https://stackoverflow.com/questions/62761101/distributed-kafka-connect-with-multiple-tasks-not-working Could you please guide me? – Hamid Jul 06 '20 at 17:13