How to utilize 100% of the network bandwidth with sockets?

Question

I have a server and a client. They run on different machines, and both machines have two 1000M network adapters.

I am using blocking TCP sockets in both the server and the client.

Server

Once a socket is accepted, a new thread will be started to process the request. It works like:

while(1) {
    recv();  /* receive a char */
    send();  /* send a line */
}

The client just sends a char to the server, and the server sends a line of text back to the client. The length of the text is about 200.

The line has been loaded into memory in advance.

Client

The client uses multiple threads to connect to the server. Once connected, each thread works like:

while(1) {
    send();  /* send a char */
    recv();  /* receive a line and  */
}

Bandwidth usage

When I use 100 threads in the client (with more threads the result is almost the same), I get this network traffic on the server:

tsar -l -i 1 --traffic

the result:

Time              -------------traffic------------
Time               bytin  bytout   pktin  pktout
06/09/14-23:12:56   0.00    0.00    0.00    0.00
06/09/14-23:12:57  63.4M  155.3M  954.6K  954.6K
06/09/14-23:12:58   0.00    0.00    0.00    0.00
06/09/14-23:12:59  60.1M  147.3M  905.4K  905.4K
06/09/14-23:13:00   0.00    0.00    0.00    0.00
06/09/14-23:13:01  57.5M  140.8M  866.5K  866.4K

and sar -n DEV 1:

11:20:46 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:20:47 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:20:47 PM      eth0 478215.05 478217.20  31756.46  77744.95      0.00      0.00      0.00
11:20:47 PM      eth1 484318.28 484318.28  32162.05  78724.16      0.00      0.00      1.08
11:20:47 PM     bond0 962533.33 962535.48  63918.51 156469.11      0.00      0.00      1.08

Question:

In theory, the max value of (bytin + bytout) could be 256M. How can I achieve that?

Any help would be great, thanks in advance.

Probably a Linux question. If it is, please tag it as such. And show the exact code... — Basile Starynkevitch, Sep 06 '14 at 15:54

score 4 · Accepted Answer · edited May 23 '17 at 12:16

In practice there is some overhead in several layers. 1 Gbit/s Ethernet does not deliver that much on the application side (I would guess at most 90% of it). A rule of thumb is to send or recv quite large amounts of data per call (e.g. several kilobytes at least). Sending or receiving a few hundred bytes at a time is inefficient. And the question is surely OS specific (I am thinking of Linux).

Recall that by definition TCP is not a transmission of packets, but of a stream of bytes. Read the Wikipedia page on TCP. You should avoid send-ing or recv-ing a few bytes, or even a hundred of them. Try to send thousands of bytes each time. Of course, a single recv on the receiving side does not (in general) correspond to a single send on the sending side, and vice versa (especially if you have some routers between the sending and receiving computers; routers can split or coalesce network packets, so you can't be sure to have one recv on the receiver for each send on the sender).
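A minimal sketch of what the byte-stream model means for code, assuming POSIX blocking sockets. Since one send need not correspond to one recv, each side loops until the number of bytes it cares about has actually moved. The helper names `send_all` and `recv_exact` are hypothetical and not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly len bytes. send() may accept fewer bytes than asked,
 * so keep calling it until everything is queued.
 * Returns 0 on success, -1 on error. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive exactly len bytes. recv() may return any smaller amount,
 * regardless of how the peer grouped its send() calls.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0)
            return -1;              /* connection closed before len bytes */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With these helpers, five 1-byte sends on one side can be read as a single 5-byte unit on the other, which is exactly the mismatch described above.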

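Gigabit Ethernet can use jumbo frames of nearly 9000 bytes (when they are enabled on both hosts and on the switches in between; the default MTU is 1500 bytes). You probably want your data buffer for send to be a little below that (because of the various overheads for IP and TCP), so try 8 Kbytes.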

The send(2) man page mentions the MSG_MORE flag for tcp(7). You could use it with care. See also this.
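As a rough illustration of MSG_MORE, here is a sketch assuming Linux. The flag tells the kernel that more data is coming, so it may hold the first part back and put it in the same TCP segment as the next one. The function name `send_two_parts` is hypothetical, and short writes are simply treated as errors to keep the sketch small.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#ifndef MSG_MORE
#define MSG_MORE 0   /* not Linux: fall back to a plain send() */
#endif

/* Queue `head` with MSG_MORE, then send `tail` normally, so that the
 * kernel is allowed to coalesce both parts into one segment.
 * Returns 0 on success, -1 on error or short write. */
int send_two_parts(int fd,
                   const void *head, size_t head_len,
                   const void *tail, size_t tail_len)
{
    assert(head != NULL && tail != NULL);

    if (send(fd, head, head_len, MSG_MORE) != (ssize_t)head_len)
        return -1;
    if (send(fd, tail, tail_len, 0) != (ssize_t)tail_len)
        return -1;
    return 0;
}
```

Note that this only affects how bytes are grouped into segments on the sending side. The receiver still sees a plain byte stream and has to find message boundaries itself.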

Also, syscalls(2) have some overhead. I'm surprised you are able to make a million of them each second. That overhead is another reason for buffering both outgoing and incoming data in significant pieces (e.g. 8192, 16384, or 32768 bytes each; you need to benchmark to find the best size). And I wouldn't be surprised if the kernel prefers page-aligned data. So perhaps try to have your buffer aligned to 4096 bytes (e.g. using mmap(2) or posix_memalign(3)...)
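A small sketch of that buffering idea, assuming a 4096-byte page and a 16384-byte batch (both values are guesses to be benchmarked, as said above). The names `alloc_batch_buffer` and `fill_batch` are hypothetical. The idea is to pack many copies of the short line into one aligned buffer, which can then go out in a single send instead of dozens.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { PAGE_ALIGN = 4096, BATCH_BYTES = 16384 };

/* Allocate a page-aligned buffer of BATCH_BYTES bytes.
 * Returns NULL on failure; release it with free(). */
char *alloc_batch_buffer(void)
{
    void *p = NULL;

    if (posix_memalign(&p, PAGE_ALIGN, BATCH_BYTES) != 0)
        return NULL;
    return p;
}

/* Copy as many whole copies of `line` as fit into `batch`.
 * Returns the number of bytes used, ready for one send() call. */
size_t fill_batch(char *batch, const char *line, size_t line_len)
{
    size_t used = 0;

    assert(batch != NULL && line != NULL && line_len > 0);
    while (used + line_len <= BATCH_BYTES) {
        memcpy(batch + used, line, line_len);
        used += line_len;
    }
    return used;
}
```

With a 200-byte line, 81 copies fit in the 16384-byte batch, so one syscall carries what would otherwise take 81.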

If you care about performance, don't use send(2) with a small byte count. At least change your application to send more than a few kilobytes (e.g. 4 Kbytes) at each send syscall. And for recv(2), pass a buffer of at least 4 kilobytes. Sending or receiving a single byte, or a line of a hundred bytes, is inefficient. Your application should buffer such data (and perhaps split data into "application messages"...). There are some libraries doing that (like 0MQ...), or you can at least terminate each message with a delimiter (newline perhaps), which would ease the splitting of a received buffer into several incoming application messages.
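A sketch of the delimiter approach, assuming messages are terminated by a newline. After each large recv, the application walks the buffer and hands every complete message to a callback. The names `drain_messages`, `message_fn` and `count_message` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*message_fn)(const char *msg, size_t len, void *ctx);

/* Hand every complete '\n'-terminated message in buf[0..len) to on_msg
 * (without the newline) and return the number of bytes consumed.
 * Bytes after the last newline form an incomplete message: the caller
 * should move them to the start of the buffer and recv() more data. */
size_t drain_messages(const char *buf, size_t len,
                      message_fn on_msg, void *ctx)
{
    size_t start = 0;

    assert(buf != NULL && on_msg != NULL);
    for (;;) {
        const char *nl = memchr(buf + start, '\n', len - start);
        if (nl == NULL)
            break;                       /* no complete message left */
        size_t msg_len = (size_t)(nl - (buf + start));
        on_msg(buf + start, msg_len, ctx);
        start += msg_len + 1;            /* skip the newline as well */
    }
    return start;
}

/* Example handler: just counts the messages it is given. */
void count_message(const char *msg, size_t len, void *ctx)
{
    (void)msg;
    (void)len;
    ++*(size_t *)ctx;
}
```

For a buffer holding "one\ntwo\nthr", this delivers two messages and reports 8 bytes consumed. The trailing "thr" stays in the buffer until the rest of that message arrives.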

My feeling is that your application is inefficient and buggy (it would probably work badly on other networks, e.g. if there are some routers between both computers). You need to redesign and recode some parts of your application! You need to buffer, and you need to manage application messages - splitting and joining them ....

You should test your application on several networks, in particular through ADSL and wifi and, if possible, long-distance networking (you'll then observe that send and recv do not "match").

Yes, I want to send thousands of messages (only a char in each request from the client), and the overhead is larger than the message. `MSG_MORE` seems to let multiple messages be sent in one packet. If so, that would be what I want. — srain, Sep 06 '14 at 15:52
You should buffer more by sending fewer messages, each having more bytes, if you care about performance. — Basile Starynkevitch, Sep 06 '14 at 15:58
I want a response for each request. Can I merge the multiple TCP headers into one, while the client and server can still receive the messages separately? — srain, Sep 06 '14 at 16:03
You have to redesign your application with buffering, and that means that you could get more than one `recv` per application request (it is already the case now). With other protocols (like UDP) you would have to deal (in the application) with lost, truncated, or duplicate packets (and that is not easy). — Basile Starynkevitch, Sep 06 '14 at 16:06

score 2 · Answer 2 · answered Sep 06 '14 at 15:34

According to my math, you are relatively close to saturating the link.

As I understand it, this is one second of traffic.

Time              -------------traffic------------
Time               bytin  bytout   pktin  pktout
06/09/14-23:12:57  63.4M  155.3M  954.6K  954.6K

A TCP packet sent over Ethernet has 82 bytes of overhead (42 ethernet, 20 IP, 20 TCP), so the amount of data received is (954.6k * 82 + 63.4M) * 8 bits, which totals about 1.1G.

I would assume that with such a large number of packets, there would be additional overhead involved with negotiation of the physical medium. Since the links have about 50% utilization, if there's an additional delay as small as (1s / 954.6k) * 50% = 500 ns (one half microsecond!) then you've accounted for the additional delay. 500 ns is the amount of time it takes for light to travel 150 meters, which isn't that far.
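The arithmetic above can be checked with a few lines of C. This is only a sketch under the answer's own assumptions: K and M are taken as 10^3 and 10^6, `bytin` is assumed to exclude headers, and the per-packet overhead is the 82 bytes (42 + 20 + 20) stated above. The function names are hypothetical.

```c
#include <assert.h>

/* Bits per second on the wire: payload plus per-packet header overhead. */
double wire_bits_per_sec(double packets_per_sec,
                         double payload_bytes_per_sec,
                         double overhead_bytes_per_packet)
{
    return (packets_per_sec * overhead_bytes_per_packet
            + payload_bytes_per_sec) * 8.0;
}

/* Extra delay per packet, in nanoseconds, that would account for the
 * given idle fraction of the link (0.5 means 50% utilization). */
double idle_ns_per_packet(double packets_per_sec, double idle_fraction)
{
    assert(packets_per_sec > 0.0);
    return 1e9 / packets_per_sec * idle_fraction;
}
```

Calling `wire_bits_per_sec(954.6e3, 63.4e6, 82.0)` gives roughly 1.13e9 bits per second, and `idle_ns_per_packet(954.6e3, 0.5)` gives roughly 524 ns, in line with the rounded 1.1G and 500 ns figures.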

answered Sep 06 '14 at 15:34

Dietrich Epp


24 ethernet. The `bytin` does not include the header size? – srain Sep 06 '14 at 15:38
Based on my assumptions, the packet overhead is larger than `bytin`, so that wouldn't be possible. However, it is possible that it counts only TCP overhead, or TCP and IP overhead, or something like that. I'm not familiar with `tsar`. – Dietrich Epp Sep 06 '14 at 15:40
@srain: The fact remains that you are trying to send and receive a million packets in one second. Send fewer packets with larger payloads if you want to saturate the link. – Dietrich Epp Sep 06 '14 at 15:42
Yes, the payload is small compared to the overhead. I want a response for each request. Can I merge multiple requests into one TCP packet, and use `recv()` (or something else) in the server to receive them as separate requests? – srain Sep 06 '14 at 16:01
No, you can't, as said in my answer. TCP is a byte stream, without application-level packets or messages. You have to buffer! – Basile Starynkevitch Sep 06 '14 at 16:04
@srain: TCP provides no guarantees that each `send()` on your client will match a `recv()` on your server. That's just not how TCP works: it's a stream of bytes. You could `send()` 1 byte 5 times, then `recv()` 5 bytes once, or vice versa. The only guarantee is that the number of bytes will match. – Dietrich Epp Sep 06 '14 at 16:10
The lockstep nature of the application described in the question makes me wonder whether OP is telling the truth about what he's doing. Each `send` will match up with a `recv` and vice-versa. – tmyklebu Sep 06 '14 at 16:17
