We are developing a custom board based on the Cyclone V, an FPGA+ARM SoC, running embedded Linux (kernel 3.10-ltsi). Our application needs to send a large chunk of raw data that resides in memory, in the range of 50-400 MB, to a Java client running on Windows 7 over TCP on gigabit Ethernet. iperf shows that our board's TCP throughput is in the range of 6xx Mbit/s.

Questions:

1. We have a requirement to send the raw memory data within a certain time interval. What is the proper way to measure the throughput in our case? Currently we just wrap the sending code with gettimeofday(), like this:

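/* Needs <errno.h>, <stdio.h>, <sys/time.h> and <unistd.h>. */
struct timeval t0, t1;
size_t total_sent = 0;      /* size_t: a signed int overflows once bytes * 8 passes 2^31 */
ssize_t bytes_sent;

gettimeofday(&t0, NULL);
while (total_sent < data_size) {
    bytes_sent = write(conn_fd, buf + total_sent, data_size - total_sent);
    if (bytes_sent == -1) {
        if (errno == EINTR)
            continue;       /* interrupted before anything was written: retry */
        perror("write");
        break;
    }
    total_sent += (size_t)bytes_sent;
}
gettimeofday(&t1, NULL);

/* Do the arithmetic in 64 bits so it cannot overflow on 32-bit ARM. */
long long elapsed_us = (long long)(t1.tv_sec - t0.tv_sec) * 1000000LL + (t1.tv_usec - t0.tv_usec);
double elapsed_s = (double)elapsed_us / 1000000.0;
/* Report the bytes actually written (the original used an unrelated img_size). */
printf("Throughput: %f Mbit/s\n", (double)total_sent * 8.0 / elapsed_s / 1000000.0);
printf("Total bytes sent: %zu\n", total_sent);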

Is this a correct method to measure the throughput?

2. Is it possible to increase the throughput by using two Ethernet ports? For example, by slicing the raw data into two parts and sending one part over each port.

3. What is the best method to increase the throughput in our case? The maximum throughput we would like to achieve is 1024 Mbit/s.

czteoh
  • For one thing, you simply can't reach 1 Gbps of payload throughput, because the protocols (Ethernet, IP, TCP) have overhead. You should be able to get very close, though. If you really want high throughput you might want to switch from TCP to UDP, but then you have to implement a lightweight TCP-like protocol yourself to handle packet reordering and loss. Also note that 1 Gbps is 1000 Mbps, not 1024 Mbps. – Some programmer dude Dec 18 '14 at 05:57
  • I understand that 1 Gbps is 1000 Mbps. The 1024 Mbit/s that I mentioned is the worst case for our throughput constraint. Is it possible to achieve a throughput of more than 1 Gbps by using UDP? – czteoh Dec 18 '14 at 08:26
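
To put a number on the protocol overhead discussed in these comments, here is a minimal sketch of the arithmetic. It assumes standard Ethernet framing with no VLAN tag, IPv4 without options, and a 1500-byte MTU; the function name `max_tcp_goodput_mbps` is made up for illustration.

```c
/* Bytes each frame occupies on the wire beyond the MTU-sized IP packet:
 * preamble + start-of-frame (8), Ethernet header (14), FCS (4),
 * and the mandatory inter-frame gap (12). Assumes no VLAN tag. */
enum { ETH_WIRE_OVERHEAD = 8 + 14 + 4 + 12 };

/* Upper bound on TCP payload rate in Mbit/s for a given link rate and MTU.
 * tcp_option_bytes is 0 for a bare TCP header, or 12 when TCP timestamps
 * are enabled. Assumes IPv4 with a 20-byte header. */
double max_tcp_goodput_mbps(double link_mbps, int mtu, int tcp_option_bytes)
{
    int payload = mtu - 20 /* IPv4 */ - 20 /* TCP */ - tcp_option_bytes;
    int on_wire = mtu + ETH_WIRE_OVERHEAD;

    return link_mbps * (double)payload / (double)on_wire;
}
```

Under these assumptions, `max_tcp_goodput_mbps(1000.0, 1500, 0)` comes out at about 949 Mbit/s, and about 941 Mbit/s with timestamps enabled, so 1024 Mbit/s of payload cannot fit on a single gigabit link whatever the transport protocol. UDP has a smaller header than TCP (8 bytes instead of 20), which raises the bound only slightly.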

1 Answer

  1. A couple of comments: the overhead of the gettimeofday() system call skews your measurements.

  2. Make sure the Ethernet port driver is NAPI enabled.

  3. If you want max throughput, try to get to zero copy. If you're stuck with TCP, maybe you can do something using vmsplice() (see: vmsplice() and TCP).

  4. For best results, drop TCP, use a packet socket with the PACKET_MMAP option (http://blog.superpat.com/2010/06/01/zero-copy-in-linux-with-sendfile-and-splice/), and implement a Reliable UDP protocol (e.g. https://bitsecant.googlecode.com/svn-history/r8/trunk/src/net/rudp/ReliableServerSocket.java for a Java implementation for the Windows 7 peer).

Good luck

gby
  • I will look into your suggestions. Regarding zero copy: AFAIK, we need a file descriptor in order to use sendfile() and splice(), right? In my case, the data is already in memory. BTW, what is the proper/accurate method to measure the throughput in my case? – czteoh Dec 18 '14 at 08:34
  • vmsplice() is "splice from virtual memory", which is exactly what you need. As for measurements, if the data you are sending is big enough, the overhead of gettimeofday() is probably negligible in your case anyway. In other cases I'd use your platform's cycle counter, something that doesn't require a system call and a kernel context switch. Again, in this particular case, if the send loop runs for many iterations it probably doesn't matter anyway. – gby Dec 18 '14 at 13:34
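
For the measurement itself, one option is sketched below. Note that it is not the cycle counter gby mentions: it assumes timestamps taken with clock_gettime(CLOCK_MONOTONIC, ...), which, unlike gettimeofday(), is not affected by wall-clock adjustments. The helper names `elapsed_ns` and `throughput_mbps` are made up for illustration.

```c
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed between two readings of the same clock. */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Throughput in Mbit/s (10^6 bits per second) for bytes moved in ns
 * nanoseconds. Returns 0.0 if the interval is not positive. */
double throughput_mbps(uint64_t bytes, int64_t ns)
{
    if (ns <= 0)
        return 0.0;
    /* bits / seconds / 1e6  ==  bytes * 8 * 1e3 / ns */
    return (double)bytes * 8.0 * 1000.0 / (double)ns;
}
```

Take one reading with `clock_gettime(CLOCK_MONOTONIC, &t0)` before the send loop and one after it, then pass the byte count and `elapsed_ns(&t0, &t1)` to `throughput_mbps()`. Keep in mind that write() returns once the data has been queued in the socket send buffer, not when the client has received it. For a 50-400 MB transfer the difference is small, but an end-to-end figure would need the clock stopped on something like an application-level acknowledgement from the Java client, which is an assumption about a protocol the question does not describe.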