2

I am facing a very strange problem, which I wasn't able to solve. I want to read (just read) data collected and sent by a micro-controller via usb as serial port (FTDI) on Mac Os X using c++. The the size of one complete data-sequence is always exactly 10 bytes. However I was using the following code to read the data:

Imported libraries:

#include <iostream>
#include <fstream>
#include <unistd.h>
#include <fcntl.h>
#include <termios.h>
#include <sys/ioctl.h>

Code:

void init(){
    serial = open(port.c_str(), O_RDWR | O_NOCTTY | O_NONBLOCK); // 0_RDONLY ?
    struct termios options;
    //set opt to 115200-8n1
    cfsetspeed(&options, B115200);
    options.c_cflag &= ~PARENB;
    options.c_cflag &= ~CSTOPB;
    options.c_cflag &= ~CSIZE;
    options.c_cflag |= CS8;

    tcsetattr(serial, TCSANOW, &options);
    if (serial < 0){
        //Error
    }else{
        //run loop
    }
}

void serial_loop(){
    long bytes_read;
    int bytes_available;
    unsigned char msg[10];
    while(1){
        do{
            usleep(1000);
            ioctl(serial, FIONREAD, &bytes_available);

        }while(bytes_available < 10); //wait for the sequence to complete

        bytes_read = read(serial, msg, 10);

        //do some parsing here
    }
}

This code worked a few days ago but now it isn't anymore. The data is reaching the computer perfectly according to the Terminal -> screen -command. I checked the port-file-name which is still correct and the port is opened successfully as well. I narrowed down my issue to the ioctl-command FIONREAD which doesn't write the correct number to the bytes_available-var (anymore). It did work, and I believe, I didn't change anything in the code.

Do you see any problem that could cause this issue? Are there any dangerous passages in my code? Thank you for your help, I'm really stuck here...

EDIT: Thanks to the feedback, I was able to get it running again. Here is the current code:

int serial;
void init(){
    serial = open(port.c_str(), O_RDWR | O_NOCTTY); //removed 0_NONBLOCK
    struct termios options;
    //set opt to 115200-8n1
    cfsetspeed(&options, B115200);
    options.c_cflag &= ~PARENB;
    options.c_cflag &= ~CSTOPB;
    options.c_cflag &= ~CSIZE;
    options.c_cflag |= CS8;

    options.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG); //Non-canonical
    options.c_cc[VMIN]     = 1; //block read until at least 1 byte was recieved
    options.c_lflag = 0;

    tcsetattr(serial, TCSANOW, &options);
    if (serial < 0){
        //Error
    }else{
        //run loop
    }
}

void serial_loop(){
    int datalength = 10;
    long bytes_read = 0;
    int bytes_in_msg = 0;
    unsigned char buf[datalength];
    unsigned char msg[datalength];
    do{
        bytes_read = read(serial, buf, datalength-bytes_in_msg);
        usleep(1000);
        if (bytes_read>0){
            memcpy(&msg[bytes_in_msg], &buf, bytes_read);
        }
        bytes_in_msg += bytes_read;
    }while(bytes_in_msg < datalength);

    //do some parsing here
    }
}

This works, but is there anything left, that could be problematic?

Thank you for your support!

archimedes
  • 87
  • 1
  • 8
  • Looks like you have a grounding problem in your device. Try to check it – vadikrobot Aug 30 '16 at 18:16
  • No, the device still works, and if I use the screen-command (commandline) or CoolTerm I can read the data without errors. Is there a way of resetting all settings for the port? – archimedes Aug 30 '16 at 18:40
  • *"This code worked a few days ago but now it isn't anymore."* -- That usually indicates improper or incomplete initialization. Your termios initialization only configures the baudrate and character size, and everything else is left to chance. See [Setting Terminal Modes Properly](http://www.chemie.fu-berlin.de/chemnet/use/info/libc/libc_12.html#SEC237) and [Serial Programming Guide for POSIX Operating Systems](http://www.cmrr.umn.edu/~strupp/serial.html) – sawdust Aug 30 '16 at 23:52
  • *"FIONREAD which doesn't write the correct number to the bytes_available-var (anymore)."* -- Negative descriptions, i.e. what does not occur, are not as helpful or specific as descriptions of what does occur. So what kind of values are you getting back? Why do you think it is *"wrong"*? Why aren't you checking the return code from each syscall, esp. this `ioctl()`? Where is the variable `serial` declared? – sawdust Aug 31 '16 at 00:10
  • Thank you for your feedback! I've set some more options to the termios-options. Are there any other important options I have to set to ensure a working communication (with an Arduino)? – archimedes Aug 31 '16 at 09:39

2 Answers2

3

This code worked a few days ago but now it isn't anymore.

Fickle program behavior usually indicates improper or incomplete initialization.
Your termios initialization only configures the baudrate and character framing, and everything else is left to chance.
See Setting Terminal Modes Properly and Serial Programming Guide for POSIX Operating Systems.


Your revised code still has not properly resolved this issue.
The code never initializes the termios structure options by calling the tcgetattr() function.
For sample code see my answer to How to open, read, and write from serial port in C?


options.c_lflag = 0;

That is not considered the proper way of assigning a termios element.


options.c_cc[VMIN]     = 1;

Non-canonical mode requires definition of both the VMIN and VTIME entries.
Your code uses whatever garbage value that exists at the location for VTIME.
See Linux Blocking vs. non Blocking Serial Read.


FIONREAD which doesn't write the correct number to the bytes_available-var (anymore).

Negative descriptions, i.e. what does not occur, are not as helpful or specific as descriptions of what does occur.
So what kind of values are you getting back?
Why do you think it is "wrong"?

More importantly, why aren't you checking the return value from each syscall, especially this ioctl() that you think is giving you problems?

Most likely is that the ioctl() failed, did not update your bytes_available variable, and returned an error code.
Instead of first checking for a good return, your code unconditionally uses the returned argument.


Another answer that critiques your code is misleading. Flow control can be disabled. Your code is broken because it doesn't do proper initialization and is not checking for error returns, not because the comm link is "deadlocked".


In your revised code:

    bytes_read = read(serial, buf, datalength-bytes_in_msg);
    usleep(1000);

A sleep or delay before and/or after a blocking read is superfluous.

sawdust
  • 16,103
  • 3
  • 40
  • 50
  • Thank you for the thorough answer. I would've voted it, if I could. Your answer on the other question you linked was really helpful setting the correct parameters / flags and of course initializing the termios options properly (as you said by calling `tcgetattr()`). – archimedes Sep 01 '16 at 20:24
  • I did check the return-values when tracking down the original issue, that's why I was able to recognize, that `ioctl()` wasn't failing (it did return 0, which means success according to the documentation I found) but setting the "wrong" value to the variable. As I said before, I am expecting 10 bytes in each sequence, but `ioctl()` gave me 2 (after the first sequence), 4 (after the second) then 6 and so on. – archimedes Sep 01 '16 at 20:24
  • *"I did check the return-values..."* -- We can only review the code that is actually posted. You provided no other details other than you concluded that the data was *"wrong"*. [Describe the problem's symptoms, not your guesses](http://www.catb.org/esr/faqs/smart-questions.html#symptoms). Given that you were submitting garbage to **tcsetattr()**, all of your previous test results are not relevant anymore. Fix the bugs in your program as you find them, and then restart testing all over again. – sawdust Sep 02 '16 at 00:25
  • Right now, I am using the new way of reading the data based on @DavidSchwartz hints. I postet the code in my original question. Since it doesn't need the ioctl-command at all, I cannot say, if the error still occurs, but at this moment I am quite sure, that it was caused by an incomplete initialization as you said, because the error was gone after initializing the port more accurate as you proposed. – archimedes Sep 02 '16 at 14:23
0

Your code is broken. No protocol can allow both the reader to wait for the writer and the writer to wait for the reader. If it did, deadlock could result.

Your do/while loop refuses to read any data until the writer writes all 10. However, serial ports permit the writer to refuse to write any more data until the reader reads what it has already written. So it cannot also permit a reader to refuse to read any more data until the writer writes more.

You cannot use FIONREAD to wait for more bytes to be written because the writer may also be waiting for you. Instead, read the bytes as they become available, thus ensuring you unblock the writer. Accumulate them in a buffer and break when you have the number you need.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
  • Thank you. I was afraid of loosing bytes when reading the message not all at once. But I changed the code as you said (_"Accumulate them in a buffer and break when you have the number you need."_) and that works great without loosing anything till now. – archimedes Aug 31 '16 at 10:00
  • *"serial ports permit the writer to refuse to write any more data until the reader reads what it has already written."* -- **Flow control** can be disabled to make your critique meaningless. – sawdust Aug 31 '16 at 20:16
  • @sawdust Huh? That doesn't even make sense. Even if you disable flow control, it will still be the case that serial ports permit the writer to refuse to write any more data until the reader reads what it has already written because disabling flow control doesn't make it cease to exist in the universe. The serial port protocol could either allow writers to wait for readers or allow readers to wait for writers. It cannot allow both. It chose to allow writers to wait for readers to make flow control *possible*. Disabling it doesn't change this. – David Schwartz Aug 31 '16 at 21:31
  • *"Huh? That doesn't even make sense."* -- That's exactly my reaction to your "answer". Nothing more than RS-232 seems to be inferred in this question. Exactly what is this (additional) *"serial port protocol"* and/or serial port "capability" (that *"permit the writer to refuse to write any more data"*) that you allege to exist in this problem? There is absolutely nothing (other than HW flow control) that a receiving UART can do to inhibit the transmitting UART (in a full-duplex link). That's why there's (cooperative) flow control, rcivr & buffer overrun checks, and/or message protocols. – sawdust Sep 01 '16 at 00:15
  • @sawdust Do you agree with the following: "Either a protocol can allow the writer to wait for the reader or it can allow the reader to wait for the writer, but it cannot allow both, because that could result in deadlock." If you agree, which of those two options do you think applies here? – David Schwartz Sep 01 '16 at 01:02
  • Again, what "protocol"? There's nothing in place to constrain the transmitting UART (on the Aduino), which in turn fills the receive buffer in question. Instead of asking rhetorical questions, why don't you answer the questions I posed to you to expand on your vague concepts? Also, you're ignoring the fact that this *"broken"* code initially "worked a few days ago"*. Your "answer" doesn't account for how this *"broken"* code could also occasionally work. – sawdust Sep 01 '16 at 01:59
  • @sawdust The broken code can occasionally work because he's violating the rules that assure that there's no deadlock, thus he has no assurance that it won't deadlock. So it *can* deadlock. I don't know what you mean by "what protocol". He is communicating between two devices, the way he's doing that is called a "protocol". Are you saying he's doing it without any method of doing so? – David Schwartz Sep 01 '16 at 16:51