I have written a program that constantly writes to and reads from a USB serial port on a piece of custom hardware. Every write is always followed by a response from the hardware.

Occasionally (days apart), every call to write starts returning EAGAIN, no matter how many times I retry.

The only way to get it working again without rebooting is to unbind and rebind the USB port, like this:

# echo -n "1-3.2" > /sys/bus/usb/drivers/usb/unbind
# echo -n "1-3.2" > /sys/bus/usb/drivers/usb/bind

I suspect the problem is in the custom hardware, but I have no idea how to prove it so that I can ask the manufacturer to fix it.

What else could I check to guarantee that the problem is not on the code?

Here's how the port is opened:

int open_port(const char * portname)
{
    struct termios termAttr;
    struct stat st0;
    struct stat st1;
    int flock_ret;
    int fd;

    fd = open(portname, O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd == -1)
    {
        printf("Can't open %s. %s", portname,
                strerror(errno));
    }
    else
    {
        fcntl(fd, F_SETFL, FNDELAY);
        tcgetattr(fd, &termAttr);
        cfsetispeed(&termAttr, B115200);
        cfsetospeed(&termAttr, B115200);

        termAttr.c_cflag |= (CLOCAL | CREAD | CS8);
        termAttr.c_iflag |= (IGNPAR | IGNBRK);

        termAttr.c_cc[VMIN] = 0;
        termAttr.c_cc[VTIME] = 0;

        tcflush(fd, TCIFLUSH);
        if (tcsetattr(fd, TCSANOW, &termAttr) == -1)
        {
            printf("%s", strerror(errno));
        }
    }

    return fd;
}

And here is how the writes are made:

void serial_write(int fd, char *comando)
{
    int cont = 0;
    int bytes_written;
    char ret[20];
    int lenght;
    char cmdfinal[200];

    if (fd < 0)
    {
        printf("** Error at %s: %s **", __func__, strerror(errno));
        return;
    }

    sprintf(cmdfinal, "%s\r", comando);

    lenght = strlen(cmdfinal);


    bytes_written = write(fd, cmdfinal, lenght);

    //Check for errors
    if (bytes_written == -1)
    {
        if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
        {
            //Wait so the next call has a better chance of succeeding
            sleep(5);
        }

        //Just print the error
        printf("%s", strerror(errno));

        //Reopen the port
        serial_somlcd_close();
        serial_somlcd_open();
    }
    else
    {
        //Do things w/ response
    }

}
natenho
  • *"What else could I check to guarantee that the problem is not on the code?"* -- IMO that's the wrong question; your code is less than ideal. A **tcgetattr()** followed by **bzero()** is illogical. See [Setting Terminal Modes Properly](http://www.chemie.fu-berlin.de/chemnet/use/info/libc/libc_12.html#SEC237). Setting both **VMIN** and **VTIME** to zero is questionable, although you haven't shown the read logic. Short writes are not properly handled properly. The root problem is probably use of nonblocking I/O, which you don't seem to need. – sawdust May 12 '16 at 17:41
  • Yes, bzero is wrong, someone left some garbage in there =p. The purpose of non-blocking mode is basically to keep such a problem from blocking indefinitely. In other words, I don't want the code to stop even if the serial port has problems being read or written. VMIN and VTIME only affect reads, right? I don't think reads are relevant in this case... – natenho May 12 '16 at 18:39
  • I couldn't understand _"Short writes are not properly handled properly."_ – natenho May 12 '16 at 18:40
  • When `bytes_written != lenght` [sic], you print out a stale value of **errno**. The **errno** is only set when the return code is -1. Which USB-serial adapter are you using? – sawdust May 12 '16 at 19:13
  • Yes, that's right, it's a redundant block; the printf could be inside the previous if block. I've made the changes you suggested. Regarding the adapter, it's proprietary hardware, connected via USB, but communication is over a serial port. Linux detects it as a ttyACMx device. – natenho May 12 '16 at 20:17
  • You broke the code by simply removing the **bzero()**; it was illogical (get values but then zero them out), but did serve a purpose. In its place you should call **cfmakeraw()**. Your original code does not bother to touch HW flow-control settings. See http://stackoverflow.com/questions/12437593/how-to-read-a-binary-data-over-serial-terminal-in-c-program/12457195#12457195 – sawdust May 12 '16 at 20:49