
I am using kafka-node within a Node.js application to create topics via the loadMetadataForTopics option. I want my application to dynamically understand the number of partitions available so that it can properly distribute messages across those partitions.

In a single node Kafka instance the method is creating the topics and returning metadata like this:

  "step1_channelOut": {
    "0": {
      "topic": "step1_channelOut",
      "partition": 0,
      "leader": 1,
      "replicas": [
        1
      ],
      "isr": [
        1
      ]
    }
  },

However in a three node cluster, the method creates more entries:

{
    "0": {
        "topic": "step1_channelOut",
        "partition": 0,
        "leader": 3,
        "replicas": [
            3,
            2,
            1
        ],
        "isr": [
            3,
            2,
            1
        ]
    },
    "1": {
        "topic": "step1_channelOut",
        "partition": 1,
        "leader": 1,
        "replicas": [
            1,
            3,
            2
        ],
        "isr": [
            1,
            3,
            2
        ]
    },
    "2": {
        "topic": "step1_channelOut",
        "partition": 2,
        "leader": 2,
        "replicas": [
            2,
            1,
            3
        ],
        "isr": [
            2,
            1,
            3
        ]
    },
    "3": {
        "topic": "step1_channelOut",
        "partition": 3,
        "leader": 3,
        "replicas": [
            3,
            1,
            2
        ],
        "isr": [
            3,
            1,
            2
        ]
    }
}

In this case did it create 4 partitions? It looks like it to me. Since auto-creation is really just a fallback (I should really set the partitions explicitly), I don't much care what it does so long as it is predictable. That said, the more control I have the better.
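For what it's worth, the way I would derive the partition count from that metadata object is simply by counting its keys, since there is one entry per partition. A minimal sketch (the `meta` object below is an abbreviated copy of the three-node output above):

```javascript
// Given the per-topic metadata object returned by loadMetadataForTopics,
// the partition count is the number of entries (one key per partition).
function partitionCount(topicMetadata) {
  return Object.keys(topicMetadata).length;
}

// Abbreviated version of the three-node metadata shown above:
const meta = {
  "0": { topic: "step1_channelOut", partition: 0 },
  "1": { topic: "step1_channelOut", partition: 1 },
  "2": { topic: "step1_channelOut", partition: 2 },
  "3": { topic: "step1_channelOut", partition: 3 }
};

console.log(partitionCount(meta)); // → 4
```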

What is the relationship between the topic information in ZooKeeper versus that on the Kafka server? Is there a better way to manipulate (create / configure topics) the Kafka cluster via Node.js?

Why four partitions? I could understand three, or one, but four?

akaphenom

1 Answer


The way kafka-node works, topics are auto-created based on your global Kafka configuration found in server.properties. Check the values of:

num.partitions=12
default.replication.factor=1

There's no automatic relation between the number of brokers and the number of partitions. You can have a 100 broker setup but only want 1 partition for topics, or you can have a single broker setup with 1,000 partitions. They are not related.

There's no non-Java API for creating topics -- at least not yet. See my previously unanswered question here.

If you want more control over how topics are created, but you still want to do it with kafka-node, you are going to have to exec a command like this:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 12 --topic <topic_name_here>

I do that in node with:

const exec = require('child_process').exec;

function createTopic(topic, replFactor, numPartitions, cb) {
  var zkHost = "localhost:2181";
  var kafkaHome = "/usr/local/kafka";

  // Shell out to the stock kafka-topics.sh tool; note that Node's exec
  // callback receives (error, stdout, stderr) in that order.
  exec(
    `${kafkaHome}/bin/kafka-topics.sh --create --zookeeper ${zkHost} --replication-factor ${replFactor} --partitions ${numPartitions} --topic ${topic}`,
    (error, stdout, stderr) => cb(topic)
  );
}
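If you want to keep the shell invocation inspectable (and testable without a running cluster), you can build the command string in a separate function and pass the result to exec. This is just a sketch along the same lines as the code above, using the same hypothetical zkHost and kafkaHome values:

```javascript
// Build the kafka-topics.sh invocation as a plain string so it can be
// logged or unit-tested before being handed to child_process.exec().
function buildCreateTopicCmd(kafkaHome, zkHost, topic, replFactor, numPartitions) {
  return `${kafkaHome}/bin/kafka-topics.sh --create` +
         ` --zookeeper ${zkHost}` +
         ` --replication-factor ${replFactor}` +
         ` --partitions ${numPartitions}` +
         ` --topic ${topic}`;
}

console.log(
  buildCreateTopicCmd('/usr/local/kafka', 'localhost:2181', 'step1_channelOut', 1, 12)
);
```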
David Griffin
  • and there it is! num.partitions=4 - I never thought to look there. Thank you. I am trying to avoid installing the Kafka tools on my clients - but yeah - I had also thought of your exec approach. I think I will be fine with the auto-create as a catch-all, but really I want to focus on setting the topics up manually ahead of time. Thank you again – akaphenom May 29 '16 at 16:48