1. In my application, which sends data over a TCP connection (a Kafka producer), I observed a drastic drop in throughput as the message size grows from 1 MB to 100 MB: from 140 MB/sec down to 25 MB/sec, with batch size = 1.
I profiled the producer process and found one suspicious point: the method 'copyFromArray' in Bits.java consumes most of the time. (The code is as follows.)
    static final long UNSAFE_COPY_THRESHOLD = 1024L * 1024L;

    static void copyFromArray(Object src, long srcBaseOffset, long srcPos,
                              long dstAddr, long length)
    {
        long offset = srcBaseOffset + srcPos;
        while (length > 0) {
            long size = (length > UNSAFE_COPY_THRESHOLD) ? UNSAFE_COPY_THRESHOLD : length;
            unsafe.copyMemory(src, offset, null, dstAddr, size);
            length -= size;
            offset += size;
            dstAddr += size;
        }
    }
Reference: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/nio/Bits.java
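To make the loop above concrete, here is a minimal sketch that reproduces only its chunking arithmetic. It does not call Unsafe at all; the class name ChunkCountSketch and the method chunkCount are hypothetical names introduced for illustration. It shows that a 1 MB message is copied in a single call, while a 100 MB message needs 100 calls of 1 MB each.

```java
class ChunkCountSketch {
    // Same value as UNSAFE_COPY_THRESHOLD in Bits.java: 1 MiB per copy call.
    static final long COPY_THRESHOLD = 1024L * 1024L;

    // Hypothetical helper: counts how many copy calls the loop in
    // copyFromArray would issue for a transfer of `length` bytes.
    static long chunkCount(long length) {
        long chunks = 0;
        while (length > 0) {
            long size = (length > COPY_THRESHOLD) ? COPY_THRESHOLD : length;
            length -= size;
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        long oneMb = 1024L * 1024L;
        System.out.println("1 MB   -> " + chunkCount(oneMb) + " copy call(s)");
        System.out.println("100 MB -> " + chunkCount(100 * oneMb) + " copy call(s)");
    }
}
```

The number of calls grows only linearly with the message size, so the chunking by itself would not obviously explain a lower MB/sec figure; that part is still an open question.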
2. Interestingly, this problem occurs only with the Java producer client and not with the Scala one, which I cannot explain.
Where should I start looking to find the cause?
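One possible first step is to time the heap-to-direct-buffer copy in isolation, outside Kafka. The sketch below is only a rough probe under stated assumptions: it assumes that ByteBuffer.put(byte[]) on a direct buffer is the path that reaches copyFromArray on this JDK, it does no JIT warm-up, and the class name DirectCopyTimer and the method copyThroughputMBps are hypothetical names, not part of any library.

```java
import java.nio.ByteBuffer;

class DirectCopyTimer {
    // Copies a heap array of `size` bytes into a direct buffer `rounds` times
    // and returns the observed throughput in MB/sec (rough, no warm-up).
    static double copyThroughputMBps(int size, int rounds) {
        byte[] src = new byte[size];
        ByteBuffer dst = ByteBuffer.allocateDirect(size);
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            dst.clear();   // reset position so the buffer can be refilled
            dst.put(src);  // bulk copy from the heap array into off-heap memory
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        double megabytes = ((double) size * rounds) / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // 1 MB, 10 MB and 100 MB, matching the message sizes in question.
        int[] sizes = {1 << 20, 10 << 20, 100 << 20};
        for (int size : sizes) {
            System.out.printf("%d bytes: %.1f MB/sec%n",
                    size, copyThroughputMBps(size, 10));
        }
    }
}
```

If the copy alone sustains a similar MB/sec at every size, the slowdown probably comes from elsewhere, for example from how often the buffer is allocated or refilled per message, rather than from copyFromArray itself.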