I am receiving a buffer that contains UTF-16BE encoded text like this:

uint8_t rx_data[length] = {0x04, 0x24, 0x04, 0x30, 0x04, 0x3C, 0x04, 0x38, 0x04, 0x3B,
            0x04, 0x38, 0x04, 0x4F, 0x00, 0x0A, 0x04, 0x18, 0x04, 0x3C, 0x04, 0x4F,
            0x00, 0x0A, 0x04, 0x1E, 0x04, 0x42, 0x04, 0x47, 0x04, 0x35, 0x04, 0x41,
            0x04, 0x42, 0x04, 0x32, 0x04, 0x3E};

The buffer contains three text strings separated by "\n", which appears as {0x00, 0x0A} in my buffer. How can I split this text into strings at each newline, so that I get something like this:

uint8_t str1[] = {0x04, 0x24, 0x04, 0x30, 0x04, 0x3C, 0x04, 0x38, 0x04, 0x3B,
                0x04, 0x38, 0x04, 0x4F};
uint8_t str2[] = {0x04, 0x18, 0x04, 0x3C, 0x04, 0x4F};
uint8_t str3[] = {0x04, 0x1E, 0x04, 0x42, 0x04, 0x47, 0x04, 0x35, 0x04, 0x41,
                0x04, 0x42, 0x04, 0x32, 0x04, 0x3E};

I am considering somehow transforming my array into a u16string or wstring from the standard library, so that I can do something like this with the transformed string:

std::wstring s_rx_data = L"string1\nstring2\nstring3";
std::wstring delimiter = L"\n";

size_t pos = 0;
std::wstring token;
while ((pos = s_rx_data.find(delimiter)) != std::wstring::npos) {
    token = s_rx_data.substr(0, pos);
    std::wcout << token << std::endl;
    s_rx_data.erase(0, pos + delimiter.length());
}
std::wcout << s_rx_data << std::endl;

And then convert it back into 3 byte arrays. The question is, how can I transform my buffer into a C++ string? Or might it be better to split the buffer more directly? For example, just search for the delimiter in a loop and then copy all the symbols before the delimiter into a new buffer.

P.S. All this happens on an STM32 MCU, so I don't have much computing power. I am receiving this buffer via Ethernet and have to split it and print it via UART on an LCD screen that supports only UTF-16BE. I have a combined C/C++ project, so I can use either C or C++ approaches.

  • The linefeed is a sequence of `0x00, 0x0A`. I'm not sure that `std::wstring` supports UTF-16 BE on your platform. If it's only a question of finding the separators then the conversion to `std::wstring` is IMHO not necessary. – Scheff's Cat Jun 10 '20 at 14:08
  • What STM32 is this? What compiler are you using? – KamilCuk Jun 10 '20 at 14:24
  • @KamilCuk The MCU is an STM32H753 running FreeRTOS. I am using Cube IDE, which has the GCC compiler. – Leonid Lianiou Jun 10 '20 at 14:35
  • So why not just 1. Find the sequence `0x00 '\n'` in the buffer. 2. Push back everything up to that sequence into a `std::vector<std::vector<uint8_t>>` that stores the lines. 3. Repeat? – KamilCuk Jun 10 '20 at 14:50
  • @KamilCuk I will try it this way, thank you! – Leonid Lianiou Jun 10 '20 at 15:01
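
The approach suggested in the comments above could be sketched roughly as follows. This is only an illustration, not tested on the STM32: `split_utf16be_lines` is a hypothetical helper name, and it assumes the buffer starts on a code-unit boundary. The loop steps two bytes at a time, because a byte-by-byte search for `0x00, 0x0A` could match across two neighbouring characters (for example the bytes `0x04, 0x00, 0x0A, 0x04`).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: splits a UTF-16BE byte buffer at every U+000A code unit.
// The delimiter bytes themselves are not copied into the output lines.
std::vector<std::vector<uint8_t>> split_utf16be_lines(const uint8_t* data, std::size_t length)
{
    std::vector<std::vector<uint8_t>> lines;
    std::size_t start = 0;
    // Step by 2 so the comparison always sits on a code-unit boundary.
    for (std::size_t i = 0; i + 1 < length; i += 2) {
        if (data[i] == 0x00 && data[i + 1] == 0x0A) {
            lines.emplace_back(data + start, data + i);
            start = i + 2;
        }
    }
    // Whatever follows the last delimiter is the final line (it may be empty).
    lines.emplace_back(data + start, data + length);
    return lines;
}
```

With the buffer from the question, `split_utf16be_lines(rx_data, length)` should yield three byte vectors that can be sent to the display one after another. If heap allocation is a concern on the MCU, the same loop could record offsets and lengths into the original buffer instead of copying.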

1 Answer

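// codecvt_utf16 decodes big-endian bytes by default (needs <codecvt> and <locale>; deprecated since C++17)
std::wstring_convert<std::codecvt_utf16<char16_t>, char16_t> convert;
const char* p = reinterpret_cast<const char*>(rx_data);
std::u16string u16 = convert.from_bytes(p, p + length);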

And here are many examples of splitting.
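
If the deprecated `<codecvt>` facilities are unavailable or unwanted, the round trip through `std::u16string` can also be written by hand. The following is a minimal sketch under that assumption. The helper names `decode_utf16be`, `encode_utf16be` and `split_lines` are made up for illustration, and the decoder simply ignores a trailing odd byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: big-endian byte pairs -> UTF-16 code units.
std::u16string decode_utf16be(const uint8_t* data, std::size_t length)
{
    std::u16string out;
    out.reserve(length / 2);
    for (std::size_t i = 0; i + 1 < length; i += 2) {
        out.push_back(static_cast<char16_t>((data[i] << 8) | data[i + 1]));
    }
    return out;  // a trailing odd byte, if any, is ignored
}

// Hypothetical helper: UTF-16 code units -> big-endian byte pairs.
std::vector<uint8_t> encode_utf16be(const std::u16string& s)
{
    std::vector<uint8_t> out;
    out.reserve(s.size() * 2);
    for (char16_t c : s) {
        out.push_back(static_cast<uint8_t>(c >> 8));
        out.push_back(static_cast<uint8_t>(c & 0xFF));
    }
    return out;
}

// Hypothetical helper: split on a single code unit, U+000A by default.
std::vector<std::u16string> split_lines(const std::u16string& s, char16_t delim = u'\n')
{
    std::vector<std::u16string> lines;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = s.find(delim, start)) != std::u16string::npos) {
        lines.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    lines.push_back(s.substr(start));  // text after the last delimiter
    return lines;
}
```

A possible usage would be `split_lines(decode_utf16be(rx_data, length))`, followed by `encode_utf16be` on each line before it is written to the UART. Compared with scanning the raw bytes, this copies the data twice, so it is probably only worthwhile if the text also needs other string operations.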