Read packet of chars into an array and then parse

Question

So to start off I apologize for being a newb at this. I've tried to work through this on my own and search out other answers but I'm not even really sure what exactly to search for. I found this to be a helpful answer but just hoping for some more clarification.

I'm working with a couple dev boards for work trying to get them to talk to each other. The one board I need to talk to gave some sample code I've been trying to work with. The TX function works fine, I'm just not familiar enough with C to understand fully what I'm doing.

The RX function they provide is:

int recv_packet(char *p, int len){
    char c;
    int received = 0;
    /* sit in a loop reading bytes until we put together
    * a whole packet.
    * Make sure not to copy them into the packet if we
    * run out of room.
    */
    while(1) {
        /* get a character to process
        */
        c = UART_GetChar();
        /* handle bytestuffing if necessary
        */
        switch(c) {
            /* if it's an END character then we're done with
            * the packet
            */
            case END:
                /* a minor optimization: if there is no
                * data in the packet, ignore it. This is
                * meant to avoid bothering IP with all
                * the empty packets generated by the
                * duplicate END characters which are in
                * turn sent to try to detect line noise.
                */
                if(received)
                    return received;
                else
                    break;
            /* if it's the same code as an ESC character, wait
            * and get another character and then figure out
            * what to store in the packet based on that.
            */
            case ESC:
                c = UART_GetChar();
                /* if "c" is not one of these two, then we
                * have a protocol violation. The best bet
                * seems to be to leave the byte alone and
                * just stuff it into the packet
                */
                switch(c) {
                    case ESC_END:
                        c = END;
                        break;
                    case ESC_ESC:
                        c = ESC;
                        break;
                }
            /* here we fall into the default handler and let
            * it store the character for us
            */
            default:
                if(received < len)
                p[received++] = c;
        }
    }
}

then based on the answer I found I'm able to call it with a function like

int main() {
    char arr[10] = {0};
    recv_packet(arr, 10);
    /*then parse it somehow-- 
    * I'll figure this out on my own, 
    * but for now I just want to read it all 
    * into an array.
    */
    parse_function(arr);
}

So if you made it through all that.... my big question is how do I prevent my array from filling up if I make the length smaller than the message I need to receive? the device I'm using C to talk to will send back a series of hex characters starting and ending with 0xC0 (otherwise defined in other code as 'END') but the middle is a whole mess of hex that is a response back from the device depending on what I send in send_packet. I guess I could make arr very large just in case but I want to know what proper coding would dictate in this scenario.

As it currently is, any bytes received beyond the size of the buffer are not saved. Are you looking to increase the size of the buffer as needed for whatever you might receive? — dbush, Aug 31 '18 at 16:58
You can dynamically resize the buffer. Doubling the size every time you reach the maximum size and copying over the old values — Mitch, Aug 31 '18 at 17:02
yes, I want to receive everything between the 'END' characters (i.e. a whole message is **C0........C0** and if it's **C0......** then something didn't work right. It would probably be a good idea to have a max so it doesn't go forever but if I'm only receiving 30 chars it seems like a waste to declare a `char arr[80]`. — BrownKuma, Aug 31 '18 at 17:04
@BrownKuma Memory is cheap. The difference between 30 and 80 is negligible. If you know the message will be no more than 1000 bytes or so, there's no problem declaring a local array of that size and just using that. — dbush, Aug 31 '18 at 17:10
@dbush sorry bad example. It would be more of a waste if I make `char arr[1000]` but only receive 30 chars. Also I'm working with a PSoC FPGA so that's not necessarily the case as it's only got 64KB of ram (which also need to run all the other stuff we're having it do). For what I'm doing initially it shouldn't receive more than 30-50 chars just giving confirms to my commands but it's a modem so eventually, once I get further along in this development, it's going to be relaying messages it is receiving which could be very large. — BrownKuma, Aug 31 '18 at 17:35
@MitchelPaulin That seems like it could be a step in the right direction but I am not even sure how to approach that. I was under the impression that C array lengths were static so I'm not sure how I would return back to my array to then parse it. Do you have an example you can give or a link to an example? — BrownKuma, Aug 31 '18 at 17:37
Essentially you will need to make an abstraction of an array, similar to ArrayLists in other languages. The SO post should help https://stackoverflow.com/questions/7221981/how-to-make-a-dynamic-sized-array-in-c — Mitch, Aug 31 '18 at 17:47
Dynamic arrays are not a perfect idea in small systems with little ram (and perhaps little power). — linuxfan says Reinstate Monica, Aug 31 '18 at 17:54

linuxfan says Reinstate Monica · Accepted Answer · 2018-09-01T07:20:46.633

I say that you need more analysis - 2 steps. Step one: define (or respect) the protocol you intend to use. The protocol must define a maximum packet length, and no larger packet would be legal. The routine you posted is already protected from too much large packets, but there is no check on validity. The protocol could define a fixed length for packet, or different lengths for different types of packets, or whatever, and you must check them.

Step two: get one packet at a time, manage it, then start over again.

There are two techniques which can help in the simple situation above. If you need to transfer big quantities of data, use more than one packet for a (big) chunk of logical data; for example, if you receive data and store them somewhere else, for example a flash memory, it's no good to receive all the data and then store all the data. Do it a piece at a time, perhaps together with a consistency check.

Second suggestion: if you are worried you can lose a packet while managing another one, you must use interrupts (may be you are already doing that). So, you can receive while you manipulate data. Sometimes, a double buffer can help; receive a packet, put it in a secondary (double) buffer, start again to receive while you analyze the secondary buffer. This can even be made directly in the interrupt routine (switching buffers instead of copying between them), but this is probably out of topic.

-- UPDATE AFTER COMMENT BELOW --

Actually, when data transmission is involved, processing incoming data while receiving it is the most efficient (and probably secure) way. You must anyway receive and parse (two things to do), so the code does not get bigger. You don't waste time while waiting for data, and you don't waste unneeded ram. At least, some low-level part of the protocol should be done while receiving (the routine you posted does something with escapes). As you mentioned, while receiving you can know, sometimes, how much data will come after.

The packet has a CRC 2 byte chunk added to the end before the **END** char (0xC0) so that is the check for validity and the parser will go down the rabbit hole of the message structuring. There is more than likely a max length for confirmation messages but none that I'm aware of from reading through their datasheet for _actual_ messages that it's relaying (it's a modem). — BrownKuma, Aug 31 '18 at 18:16
I did have a thought along the lines of "step two" where instead of having an array that might not be the right size as a buffer receive the message then pass it to a parser, to instead build out this function to parse as it receives. Part of the messaging structure of the device is to tell you what to expect and even how long (i.e. END char, device id, message protocol, message format, message length, message, CRC, END char). I think I should probably just avoid the headache and do it right from the beginning and pursue that route. Thanks! — BrownKuma, Aug 31 '18 at 18:23
@BrownKuma this is the perfect idea, when possible! Don't store-then-parse, but parse while receiving!. This point should be added in my reply. Tomorrow. Best wishes and goodnight... — linuxfan says Reinstate Monica, Aug 31 '18 at 18:27

score -1 · Answer 2 · answered Aug 31 '18 at 17:18

-1

I guess I could make arr very large just in case but I want to know what proper coding would dictate in this scenario.

That is correct - when reading from I/O, whether it be network, filesystem, etc, it is common to use a buffer, and to make it larger than what you expect your incoming message size to be.

So instead of char arr[10] = {0};, you could do

char arr[10000];

and just reuse it and not worry about running out of room.

answered Aug 31 '18 at 17:18

vasia

1,093
7
18

I find this answer horrible. The correct way to go is not to allocate arbitrary large buffers, which can be *always* overrunned, but to *limit* the data to be written in the buffer. I would give -2, if I could. – linuxfan says Reinstate Monica Aug 31 '18 at 17:37
@vasia I am aware I could do that but I'm hoping to avoid arbitrarily allocating a lot of memory to something that a majority of the time will be much smaller. – BrownKuma Aug 31 '18 at 17:55
@linuxfan Thank you. As I said in the comments to my question I'm working with limited memory so I am hoping to be able to somehow dynamically resize the array to be roughly the size of the packet received. Do you have a suggestion of how I could do that? – BrownKuma Aug 31 '18 at 17:55
@BrownKuma in embedded world, it is normal to allocate a buffer just big enough, and always use that, perhaps in different ways (the same buffer can be used for two different non-concurrent jobs). I don't understand how much your packets are variable. If their length is max 1024 bytes, just allocate a buffer of 1024; it is not a waste to use it for smaller packets. But if your packets sometimes are 32K, then you are low in ram. Anyway, you *can* use malloc() for dynamic memory, probably. But you must know beforehand that you will never need more than your 64K (dynamic ram costs ram!) – linuxfan says Reinstate Monica Aug 31 '18 at 18:15
@BrownKuma I didn't know that this is running on an embedded device, and that you need to be conservative with memory. I don't believe the comments that describe the memory constraints were up when I answered the question.. – vasia Sep 01 '18 at 01:40
@linuxfan The asker mentioned that they were a newb, so I gave them the simplest possible answer. If I needed to be conservative with memory I would do something similar to what you described in your comment. Also, if the buffer size is larger than the largest possible message, then the buffer will never overflow... So it is false that the buffer can be *always* overrunned (overflowed). – vasia Sep 01 '18 at 01:43
@vasia buffer overflows are the most common hacker attack, and they can also happen because of a transmission error (transmissions are *not* 100% reliable). No no, definitely. Buffers must be large enough, but protected from overrun. Stop. – linuxfan says Reinstate Monica Sep 01 '18 at 07:14

Read packet of chars into an array and then parse

2 Answers2