
My question has two parts. I've read "Kafka Connect - Delete Connector with configs?". I'd like to completely remove a connector, offsets and all, so that I can recreate it later under the same name. Is this possible? As I understand it, a tombstone message will kill this connector indefinitely.

The second part: is there a way to have the kafka-connect container automatically delete all the connectors it created when it is brought down? Thanks.

Omri. B

1 Answer


There is no single command that completely cleans up a connector's state. For sink connectors, you can use kafka-consumer-groups to reset their offsets. For source connectors it's not as straightforward, because you'll need to manually produce data into the Connect-managed offsets topic.
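As a rough sketch of both cases, the Python helpers below only build the pieces you would need; they do not talk to Kafka. They assume the default sink consumer group name `connect-<connector name>`, and that source offsets are keyed by the compact JSON array `["<connector name>", {<source partition>}]`, which is what a worker with the default JSON internal converter writes. The function names, the example connector names, and the `localhost:9092` address are hypothetical. The script may be called `kafka-consumer-groups.sh` in your distribution.

```python
import json
from typing import List, Mapping, Tuple


def sink_reset_command(connector: str, bootstrap: str = "localhost:9092") -> List[str]:
    """Build a kafka-consumer-groups call that rewinds a sink connector.

    Assumes the default group name "connect-<connector name>". The group
    must be inactive (connector deleted or stopped) for the reset to work.
    """
    return [
        "kafka-consumer-groups",
        "--bootstrap-server", bootstrap,
        "--group", f"connect-{connector}",
        "--reset-offsets", "--all-topics", "--to-earliest",
        "--execute",
    ]


def source_offset_tombstone(
    connector: str, source_partition: Mapping[str, object]
) -> Tuple[bytes, None]:
    """Build the (key, value) pair of a tombstone for one source offset.

    The key is the compact JSON array [connector, source_partition]; the
    None value is what makes the record a tombstone. The key has to match
    the stored key byte for byte, so compare it with what is actually in
    the offsets topic before producing.
    """
    key = json.dumps([connector, dict(source_partition)], separators=(",", ":"))
    return key.encode("utf-8"), None


if __name__ == "__main__":
    # Hypothetical connector names, for illustration only.
    print(" ".join(sink_reset_command("my-sink")))
    key, value = source_offset_tombstone("file-source", {"filename": "test.txt"})
    print(key, value)
```

The tombstone pair would then be sent with whichever producer you already use, to the topic named by the worker's `offset.storage.topic` setting and to the same partition that holds the original record. Consuming the offsets topic first and copying the exact key and partition is the safest way to get both right.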

The config and status topics also persist historical data, but shouldn't prevent you from recreating the connector with the same name/details.

The Connect containers published by Confluent and Debezium always use distributed mode. To avoid persisting connector metadata in Kafka topics, you'll need to override the container's entrypoint so that it runs in standalone mode (this won't be fault tolerant, but it'll be fine for testing).

OneCricketeer
  • So it's basically impossible to reuse the same connector name with a source connector? – Omri. B Nov 17 '22 at 13:59
  • Not impossible, just more difficult than a simple HTTP call. There is a JIRA, I think, to fully clean up connector state on DELETE calls. – OneCricketeer Nov 17 '22 at 21:47
  • Thanks, I'll look for the JIRA and watch it. So, in order to delete it completely for now - will deleting the `config.storage.topic` and `offset` topic do? (source connector). I thought about using distributed in a standalone manner, so defining a different `group.id` for each connector. Is that something you think will work? – Omri. B Nov 20 '22 at 07:14
  • You don't need to touch the config topic. If you use standalone, none of those topics are even used. You should be able to set the same `group.id` in Connect workers since it'll just create a networked cluster in any mode – OneCricketeer Nov 20 '22 at 13:23
  • I'm using Confluent images, so I thought I'd be better off using distributed mode but with separate clusters, which keeps the actual distributed option open ;) – Omri. B Nov 20 '22 at 13:51
  • Downside with separate clusters is that they each should ideally use separate config/offset/status topics – OneCricketeer Nov 20 '22 at 14:04
  • I don't really have a problem with this. I will have a couple hundred clusters, max. I've read that Kafka can handle thousands of topics, so is there another issue I'm not thinking of? – Omri. B Nov 20 '22 at 14:42