I am trying to read messages sent to my application over a serial port, and for the most part this works. My function assumes that the bytes read from the port are text characters, which is where my concern comes in: sometimes unknown characters or raw data end up in the character array. Before processing a message, I want to check the array to make sure it contains no raw data or unknown characters. I have added the check below, but is there a better way to do this?

/// \return -1 for success or returns index of failed check
int serial_error_check (char *buffer, int bufLen)
{
    for (int i = 0; i < bufLen; i++)
    {
        // 10 is LF, 13 is CR, all other characters fall between 32 and 126
        if (buffer[i] != 10 && buffer[i] != 13)
        {
            if (buffer[i] < 32 || buffer[i] > 126)
            {
                return i;
            }
        }
    }
    return -1;
}

C code on Linux OS.

EDIT: I tested this code by changing the baud rate on the serial port so that it no longer matched the receiving end, then sent data across, which put unknown characters into the buffer. The function returned, but one of my processes then crashed. So, to refocus my question: how should I check each character index to make sure the buffer isn't corrupted? Side notes: the connection is to a piece of hardware whose firmware has already been written, so handshaking or a checksum probably won't work. The buffer is char[128], and I get the size of the message sent, but that number is the size of 1 character.
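
For reference, here is a minimal sketch of the same per-byte check written so that it does not depend on whether plain `char` is signed. The name `serial_find_invalid` is hypothetical. It assumes the caller passes the number of bytes actually received (for example, the return value of `read()`) rather than the size of the whole `char[128]` buffer. The cause of the crash is not shown above, so this only rules out the check itself as the source.

```c
#include <stddef.h>

/* Returns -1 if every byte in buffer[0..len) is printable ASCII, LF, or CR;
 * otherwise returns the index of the first byte that is not. */
int serial_find_invalid(const char *buffer, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        /* Convert to unsigned char so bytes 0x80..0xFF compare as 128..255
         * whether plain char is signed or unsigned on this platform. */
        unsigned char c = (unsigned char)buffer[i];

        if (c == '\n' || c == '\r')
            continue;               /* LF and CR are allowed */

        if (c < 0x20 || c > 0x7E)   /* outside printable ASCII (32..126) */
            return (int)i;
    }
    return -1;
}
```

A return value of -1 only says that every byte looks like text. It does not say the message is the one that was sent, and the buffer is still not a C string unless the caller adds a terminating `'\0'` before handing it to string functions.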

  • Framing bytes, headers, length fields, and checksums are some common techniques. – JohnFilleau Mar 21 '22 at 02:35
  • Better can mean a lot of things. I don't see anything obviously wrong with the code you have. It looks like you are checking that the data characters are between 1 and 126, any other values are invalid. Is that what you want? Why do you think there might be a better way? – Stuart Mar 21 '22 at 03:06
  • I haven’t done any serial port work before now, so I am not sure if there are some nuances that I may have missed. One thing I worry about is some sort of error or issue when checking a character to see what number it is. Will it even have a number if the character is unknown? I see the diamond with the question mark in it a lot – Timothy Harmon Mar 21 '22 at 03:10
  • Say it out loud: _"value is less than 1 AND greater than 126"_ Keep saying it out loud while you try to think of any number that satisfies such a condition. – paddy Mar 21 '22 at 03:15
  • [What type of framing to use in serial communication](https://stackoverflow.com/q/17073302/2410359) may help. – chux - Reinstate Monica Mar 21 '22 at 04:26
  • I'd be asking myself why there are "unknown characters and raw data" being received. Are you sure they aren't correct characters with the parity bit set? In which case you will want to mask off the top bit ( `&=0x7f` ) before storing. Also as paddy mentioned, your test is wrong - should be `||` rather than `&&`. – Dipstick Mar 21 '22 at 08:23
  • The comment from @paddy is hinting at a fix you should probably make to your code. Basically, you need to change the && to || because a character can't be less than 1 and greater than 126 at the same time. If the data you are reading is ASCII data (you can do an internet search on "ASCII table" for more information), you can change the expression in the if statement to buf[i] < 32 || buf[i] > 126. This will filter out the non-printable characters. Good luck! – Stuart Mar 21 '22 at 08:33
  • I wrote the original post on my phone and didn't have the code in front of me, hence the logic error. I also updated/edited the question to address an actual issue when testing the code today. My real function checks for characters 32-126 but also for the LF/CR character numbers. – Timothy Harmon Mar 21 '22 at 20:30
  • Implement a protocol for your messages, with a header, with checksum. The simplest that I know of is https://en.wikipedia.org/wiki/NMEA_0183 . `$` is the header, `\n` ends a message, after `*` there's a checksum – KamilCuk Mar 22 '22 at 02:24
  • This reads like an [XY problem](https://xyproblem.info/). You claim to be reading a serial port, but no line settings are mentioned other than baud rate. You're using a buffer of type char, but there is no mention of 7- or 8-bit data (nor UART frame size). You seem to have the mistaken impression that there's a difference between *"characters"* and *"raw data"*; the only difference is interpretation of the values. Now if you mean only *printable* ASCII characters, then you need to explicitly specify that. – sawdust Mar 25 '22 at 05:31
  • *"how should I check each character index to make sure the buffer isn't corrupted?"* -- You can't. Testing individual bytes of a message will not reliably detect any message corruption. That essentially requires that you already know what the message should be. You need something added to the message contents/payload, such as a CRC and maybe message framing, to validate the message integrity. – sawdust Mar 25 '22 at 05:41
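
The framing-plus-checksum idea raised in the comments can be sketched as follows, using the NMEA 0183 convention: a sentence starts with `$`, the payload runs up to `*`, and the two hex digits after `*` are the XOR of the payload bytes. The names `hex_value` and `nmea_style_checksum_ok` are hypothetical, and the sketch assumes the device already sends messages in this form or its protocol can be changed, which the question suggests may not be the case.

```c
#include <stddef.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Returns 1 if msg[0..len) has the form "$<payload>*HH", optionally followed
 * by CR/LF, and HH equals the XOR of the payload bytes. Returns 0 otherwise. */
int nmea_style_checksum_ok(const char *msg, size_t len)
{
    if (len < 4 || msg[0] != '$')
        return 0;

    /* XOR every byte between '$' and '*' (both exclusive). */
    unsigned char sum = 0;
    size_t i = 1;
    while (i < len && msg[i] != '*')
    {
        sum ^= (unsigned char)msg[i];
        i++;
    }

    /* The '*' and both hex digits must lie inside the buffer. */
    if (i + 2 >= len)
        return 0;

    int hi = hex_value(msg[i + 1]);
    int lo = hex_value(msg[i + 2]);
    if (hi < 0 || lo < 0)
        return 0;

    /* Only line terminators may follow the checksum. */
    for (size_t j = i + 3; j < len; j++)
    {
        if (msg[j] != '\r' && msg[j] != '\n')
            return 0;
    }

    return sum == (unsigned char)(hi * 16 + lo);
}
```

For example, `nmea_style_checksum_ok("$AB*03\r\n", 8)` returns 1 because `'A' ^ 'B'` is `0x03`. A single XOR byte is a weak check; a CRC catches more error patterns at the cost of a little more code.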

0 Answers