
In a special application, our server needs to update the firmware of resource-constrained sensor/tracking devices. We encountered a problem in which the remote devices (clients) sometimes lose data while receiving packets of the new firmware. The connection is TCP/IP over a GPRS network. The devices use a SIM900 GSM chip as their network interface.

The problem possibly comes from the device receiving too much data. We tried reducing the traffic by sending packets less frequently, but the error still occurred sometimes.

We contacted the local retailer of the SIM900 chip, who is also responsible for technical support and, if necessary, for contacting the chip's Chinese manufacturer (SIMCom). They said that we should first try to reduce the TCP MSS (Maximum Segment Size) of our connection.

On our server, I did the following:

static int
create_master_socket(unsigned short master_port) {

    static struct sockaddr_in master_address;
    int master_socket = socket(AF_INET,SOCK_STREAM,0);
    if(master_socket == -1) {  // socket() returns -1 on failure, not 0
            perror("socket");
            throw runtime_error("Failed to create master socket.");
    }

    int tr=1;
    if(setsockopt(master_socket,SOL_SOCKET,SO_REUSEADDR,&tr,sizeof(int))==-1) {
            perror("setsockopt");
            throw runtime_error("Failed to set SO_REUSEADDR on master socket");
    }

    master_address.sin_family = AF_INET;
    master_address.sin_addr.s_addr = INADDR_ANY;
    master_address.sin_port = htons(master_port);
    uint16_t tcp_maxseg;
    socklen_t tcp_maxseg_len = sizeof(tcp_maxseg);
    if(getsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, &tcp_maxseg_len)) {
            log_error << "Failed to get TCP_MAXSEG for master socket. Reason: " << errno;
            perror("getsockopt");
    } else {
            log_info << "TCP_MAXSEG: " << tcp_maxseg;
    }
    tcp_maxseg = 256;
    if(setsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, tcp_maxseg_len)) {
            log_error << "Failed to set TCP_MAXSEG for master socket. Reason: " << errno;
            perror("setsockopt");
    } else {
            log_info << "TCP_MAXSEG: " << tcp_maxseg;
    }
    if(getsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, &tcp_maxseg_len)) {
            log_error << "Failed to get TCP_MAXSEG for master socket. Reason: " << errno;
            perror("getsockopt");
    } else {
            log_info << "TCP_MAXSEG: " << tcp_maxseg;
    }
    if(bind(master_socket, (struct sockaddr*)&master_address,
                            sizeof(master_address))) {
            perror("bind");
            close(master_socket);
            throw runtime_error("Failed to bind master_socket to port");

    }

    return master_socket;
}

Running the above code results in:

I0807 ... main.cpp:267] TCP_MAXSEG: 536
E0807 ... main.cpp:271] Failed to set TCP_MAXSEG for master socket. Reason: 22 setsockopt: Invalid argument
I0807 ... main.cpp:280] TCP_MAXSEG: 536

As you can see, the problem is in the second line of the output: setsockopt returns "Invalid argument".

Why does this happen? I read about some constraints on setting TCP_MAXSEG, but I did not find any report of behaviour like this.

Thanks, Dennis

dennis90
  • It sounds to me like everybody is guessing here. The device should handle a standard MTU and MSS correctly: if it doesn't, or more likely if it has some other TCP bug, they should *fix* it. – user207421 Aug 07 '13 at 21:41

2 Answers


In addition to xaxxon's answer, I just wanted to note my experience with trying to force my Linux machine to send TCP segments no larger than a certain size (lower than what they normally are):

  • The easiest way I found to do so, was to use iptables:

sudo iptables -A INPUT -p tcp --tcp-flags SYN,RST SYN --destination 1.1.1.1 -j TCPMSS --set-mss 200

This overwrites the remote incoming SYN/ACK packet on an outbound connection, and forces the MSS to a specific value.

Note 1: You do not see this in Wireshark, since Wireshark captures before this rewrite happens.

Note 2: iptables does not allow you to *increase* the MSS, only to lower it.

  • Alternatively, I also tried setting the socket option TCP_MAXSEG, as dennis90 had done. After applying the fix from xaxxon's answer, this also worked.

Note: You should read the MSS value only after the connection has been set up. Otherwise it returns the default value, which put me (and dennis90) on the wrong track.

Finally, I also ran into a number of other things:

  • I ran into TCP-offloading issues, where despite the MSS being set correctly, Wireshark still showed the outgoing frames as too big. You can disable this feature with: sudo ethtool -K eth0 tx off sg off tso off. This took me a long time to figure out.

  • TCP has lots of fancy features like path MTU discovery, which actually try to dynamically increase the MSS. Fun and cool, but obviously confusing. I did not run into issues with it in my tests, though.

Hope this helps someone trying to do the same thing one day.

Arnout

From the setsockopt(2) man page:

    Unless otherwise noted, optval is a pointer to an int.

but you're using a uint16_t. I don't see anything saying that this parameter isn't an int.

edit: Yeah, here is the kernel source, and you can see:

637         if (optlen < sizeof(int))
638                 return -EINVAL;
xaxxon
  • This is it, thank you! Unfortunately the core of the problem still exists, just in a slightly different form: now setsockopt returns 0, but the value doesn't change. Time for some more research! – dennis90 Aug 08 '13 at 07:07
  • @dennis90 have you tried disabling the Nagle algorithm and making smaller sends? I guess that doesn't really guarantee anything.. but it would probably work. – xaxxon Aug 08 '13 at 08:11
  • Does disabling the Nagle algorithm mean simply setting TCP_NODELAY on the sockets returned by the server's accept() call? I tried that, but it didn't work. I will try again now, though; maybe I did something wrong. – dennis90 Aug 08 '13 at 08:50
  • yeah.. basically the OS will try to send data as soon as you give it, instead of waiting a bit to see if you'll give it more. So if you send small bits, it will **likely** send them right away.. so send small bits – xaxxon Aug 08 '13 at 09:06
  • Last time I tried it I definitely made some mistakes (probably because I was tired...). Now, with the Nagle algorithm disabled and a careful, timed sender thread in the server, everything works fine! I got 1 broken package out of 2932 (and even that one I could detect). Thank you! – dennis90 Aug 08 '13 at 12:39
  • @dennis90 yay! While this is tacky and the MSS solution would be more ideal.. sometimes the one that works is better. PLEASE, however, comment the details as to why you've gone with this solution in the code for the guy who has to work on this next and would otherwise think that you're a complete moron. – xaxxon Aug 08 '13 at 18:02
  • Ok (though this comment box is very tiny...). We had the SIM900 communicating with our CPU over UART, with an interrupt-driven UART driver linking the SIM900 to our modem driver code. When a package of, say, 538 bytes was received, the modem wrote "AT+IPD,538:" on the UART, followed (ideally) by 538 bytes of data... But when we were sending such big packages, 1 out of 10 times 538 bytes didn't arrive, only 533 or 535, etc. We set up a character counter in our interrupt handler that counted characters after recognizing an IPD message... – dennis90 Aug 10 '13 at 08:54
  • ... until it was cleared by the modem driver, and we realized that some of these missing bytes actually never come out of the modem. The more stress the modem was under (for example, sending data at the same time as receiving), the more packages it broke. We contacted the local retailer, and he advised reducing the TCP maximum segment size. But as it turned out in this SO thread, I couldn't reduce the MSS on the server. As xaxxon advised, I turned on TCP_NODELAY in the server and wrapped the socket write calls with code that broke big packages into smaller pieces and sent them in a timed fashion. – dennis90 Aug 10 '13 at 08:57
  • Since TCP is a streaming protocol, it doesn't really matter how you split up your data, though it is not common to do this splitting yourself. Also, I am not sure that this method does real splitting and that my server's network stack never merges two consecutive packages while sending... – dennis90 Aug 10 '13 at 08:59
  • @dennis90 nice job.. but what I meant was to put the details in your source code. – xaxxon Aug 10 '13 at 22:27
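For reference, the workaround dennis90 settled on in the comments (TCP_NODELAY plus sending the payload in small, paced chunks) might look roughly like this sketch; the chunk size and delay parameters are illustrative, not values from the thread:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>

// Sends `len` bytes in chunks of at most `chunk` bytes, pausing between
// chunks so a slow modem can drain its buffers. Returns bytes sent, or -1.
ssize_t send_paced(int sock, const char* buf, size_t len,
                   size_t chunk, useconds_t pause_us) {
    int one = 1;  // disable Nagle so each small write is pushed out promptly
    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)))
        return -1;

    size_t off = 0;
    while (off < len) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        ssize_t w = send(sock, buf + off, n, 0);
        if (w < 0) return -1;
        off += (size_t)w;
        usleep(pause_us);  // pacing; TCP may still coalesce segments
    }
    return (ssize_t)off;
}
```

As dennis90 notes above, TCP gives no guarantee that these chunks arrive as separate segments; this only reduces the burst rate at the sender.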