I am trying to use StringBuilder to build the log-file text for the output that is sent over the serial port. The output is stored in a byte array, and I am iterating over it.

ref class UART_G {
   public:
     static array<System::Byte>^ message = nullptr;
     static uint8_t message_length = 0;
};

static void logSend ()
{
  StringBuilder^ outputsb = gcnew StringBuilder();
  outputsb->Append("Sent ");
  for (uint8_t i = 0; i < UART_G::message_length; i ++)
  {
    unsigned char mychar = UART_G::message[i];
    if (
      (mychar >= ' ' && mychar <= 'Z') || //Includes 0-9, A-Z.
      (mychar >= '^' && mychar <= '~') || //Includes a-z.
      (mychar >= 128 && mychar <= 254)) //I think these are okay.
    {
      outputsb->Append(L""+mychar);
    }
    else
    {
      outputsb->Append("[");
      outputsb->Append(mychar);
      outputsb->Append("]");
    }
  }
  log_line(outputsb->ToString());
}

I want all plain-text characters (e.g. A, :) to be logged as text, while control characters (e.g. BEL, NEWLINE) are logged as their byte values in brackets, like [7][13].

What is happening is that the StringBuilder, in all cases, is outputting the character as a number. For example, A is being sent out as 65.

For example, if I have the string 'APPLE' and a newline in my byte array, I want to see:

Sent APPLE[13]

Instead, I see:

Sent 6580807669[13]

I have tried every way I could think of to get it to display the character properly, including type-casting, concatenating it to a string, and changing the variable type. I would really appreciate it if anyone knows how to do this. My log files are largely unreadable without this function.

azoundria

3 Answers

2

You're getting the ASCII values because the compiler is choosing one of the Append overloads that takes an integer of some sort. To fix this, you could use an explicit cast to System::Char to force the correct overload.
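To illustrate the overload selection described above, here is a minimal sketch in plain C++ rather than C++/CLI. MiniBuilder and demo_overloads are hypothetical names invented for this example; the two append overloads only stand in for Append(Byte) and Append(Char), and are not the real StringBuilder API.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for StringBuilder with two overloads that
// mirror Append(Byte) and Append(Char).
class MiniBuilder {
public:
    // Integer overload: formats the numeric value of the byte.
    MiniBuilder& append(std::uint8_t value) {
        text_ += std::to_string(static_cast<unsigned>(value));
        return *this;
    }

    // Character overload: appends the character itself.
    MiniBuilder& append(char ch) {
        text_ += ch;
        return *this;
    }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};

std::string demo_overloads() {
    const std::uint8_t byte = 65;        // the byte value of 'A'
    MiniBuilder sb;
    sb.append(byte);                     // exact match: integer overload
    sb.append(' ');
    sb.append(static_cast<char>(byte));  // the cast selects the character overload
    return sb.str();
}
```

The argument's static type decides which overload runs, so the same byte is formatted as a number in the first call and as a character after the cast.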

However, that won't necessarily give the proper results for 128-255. You could cast a value in that range from Byte to Char, and it'll give something, but not necessarily what you expect. First, in Unicode 0x80 through 0x9F are control characters. Second, the device sending the bytes may not intend 0xA0 through 0xFF to mean the same characters that Unicode assigns to those values.

In my opinion, the best solution would be to use the "[value]" syntax that you're using for the other control characters for 0x80 through 0xFF as well. However, if you do want to convert those to characters, I'd use Encoding::Default, not Encoding::ASCII. ASCII only defines 0x00 through 0x7F, so 0x80 and higher will come out as "?". Encoding::Default is whatever code page is defined for the language you have selected in Windows.

Combine all that, and here's what you'd end up with:

for (uint8_t i = 0; i < UART_G::message_length; i ++)
{
  unsigned char mychar = UART_G::message[i];

  if (mychar >= ' ' && mychar <= '~' && mychar != '[' && mychar != ']')
  {
    // Use the character directly for all ASCII printable characters, 
    // except '[' and ']', because those have a special meaning, below.
    outputsb->Append((System::Char)(mychar));
  }
  else if (mychar >= 128)
  {
    // Non-ASCII characters, use the default encoding to convert to Unicode.
    outputsb->Append(Encoding::Default->GetChars(UART_G::message, i, 1));
  }
  else
  {
    // Unprintable characters, use the byte value in brackets.
    // Also do this for bracket characters, so there's no ambiguity 
    // what a bracket means in the logs. 
    outputsb->Append("[");
    outputsb->Append((unsigned int)mychar);
    outputsb->Append("]");
  }
}
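The other option mentioned in this answer, using the "[value]" syntax for 0x80 through 0xFF as well, avoids the encoding question entirely. A minimal sketch in plain C++ (not C++/CLI), assuming the message is available as a std::vector of bytes; format_for_log is a hypothetical helper name, not part of the original code:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: builds the log text for a raw byte buffer.
// Printable ASCII is copied as-is; everything else, including
// 0x80-0xFF and the bracket characters, is written as [value].
std::string format_for_log(const std::vector<std::uint8_t>& message) {
    std::string out = "Sent ";
    for (std::uint8_t byte : message) {
        const bool printable = byte >= 0x20 && byte <= 0x7E;
        const bool bracket = byte == '[' || byte == ']';
        if (printable && !bracket) {
            out += static_cast<char>(byte);
        } else {
            out += '[';
            out += std::to_string(static_cast<unsigned>(byte));
            out += ']';
        }
    }
    return out;
}
```

With this variant the log contains only ASCII, so the byte values can be recovered from it regardless of how the log file is encoded.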
David Yaw
  • Thanks very much! There's just one minor problem I'm sure I will figure out soon: Error 1 error C2039: 'GetChars' : is not a member of 'System::Text::Encoding::Default'. I am using namespace System::Text. Is there something else I need to use as well? – azoundria Dec 31 '15 at 21:41
  • Encoding::Default->GetChars appears to work much better! All the bytes are as I want them! Thanks again so much for your help. – azoundria Dec 31 '15 at 21:46
  • The default character set might not support 256 characters. For example, Windows-1252 has only 251 characters. And, of course, the default character set varies by system. – Tom Blodget Dec 31 '15 at 21:54
0

You are receiving the ASCII values of the characters in the string.

See the ASCII chart:

 65 = A
 80 = P
 80 = P
 76 = L
 69 = E

Just write a function that converts each ASCII value to a string.

Pradheep
  • I know what I'm getting out. What would such a conversion function look like? Surely I don't need to create an array or switch statement with 255 strings in it? – azoundria Dec 31 '15 at 20:04
  • http://stackoverflow.com/questions/7693994/how-to-convert-ascii-code-0-255-to-a-string-of-the-associated-character – Pradheep Dec 31 '15 at 20:07
  • That is for Java. When I do it in C++/CLI, Character is not defined. – azoundria Dec 31 '15 at 20:10
0

Here is the code I came up with which resolved the issue:

static void logSend ()
{
  StringBuilder^ outputsb = gcnew StringBuilder();
  ASCIIEncoding^ ascii = gcnew ASCIIEncoding;
  outputsb->Append("Sent ");
  for (uint8_t i = 0; i < UART_G::message_length; i ++)
  {
    unsigned char mychar = UART_G::message[i];
    if (
      (mychar >= ' ' && mychar <= 'Z') || //Includes 0-9, A-Z.
      (mychar >= '^' && mychar <= '~') || //Includes a-z.
      (mychar >= 128 && mychar <= 254)) //I think these are okay.
    {
      outputsb->Append(ascii->GetString(UART_G::message, i, 1));
    }
    else
    {
      outputsb->Append("[");
      outputsb->Append(mychar);
      outputsb->Append("]");
    }
  }
  log_line(outputsb->ToString());
}

I still appreciate any alternatives which are more efficient or simpler to read.

azoundria
  • Unfortunately, this does not work for extended ASCII characters (e.g. ø or ¾ will convert to ?). I have confirmed in HexD that those symbols are all written as ? (code 0x3F) in the actual file. This isn't mission-critical, but I'd like to know how to solve this. – azoundria Dec 31 '15 at 20:36
  • It would work if you determine which character set and encoding the device is sending and use it instead of ASCII. (ASCII is seldom the right answer.) Note: Extended ASCII is not a specific character set so that is not the right answer either. – Tom Blodget Dec 31 '15 at 20:45
  • Thanks for your help. To clarify, I would like to understand why the actual encoding (the bytes in the text file) does not match the bytes in the array within this range. Ideally, the output to the text file would match the array exactly, except for the limited set of control characters which are displayed as I indicated. So when I open the file in HexD, I would see the bytes as they are in the array instead of ? (0x3F), which is ambiguous. How the recipient of the text file wants to interpret or display those bytes would be up to them. – azoundria Dec 31 '15 at 21:31
  • The encoding in the log file is determined by the implementation of `log_line`. Without seeing it, I'd guess there is a fair chance that it is UTF-8. – Tom Blodget Dec 31 '15 at 21:48
  • That would most likely be correct. I needed to properly convert this data as provided by David Yaw. – azoundria Dec 31 '15 at 21:51
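
If log_line does write UTF-8, as guessed in the comments above, then a byte of 0x80 or higher cannot appear unchanged in the file once it has been turned into a character: each such character is stored as two bytes. The sketch below, in plain C++, shows that step under the assumption that the byte is read as ISO-8859-1 (where the byte value equals the Unicode code point). latin1_byte_to_utf8 is a hypothetical helper name; the actual device character set and the behavior of log_line are not known from this thread.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encodes one byte, interpreted as ISO-8859-1,
// as UTF-8. Code points below 0x80 take one byte; 0x80-0xFF take two.
std::string latin1_byte_to_utf8(std::uint8_t byte) {
    std::string out;
    if (byte < 0x80) {
        out += static_cast<char>(byte);
    } else {
        out += static_cast<char>(0xC0 | (byte >> 6));    // lead byte: 110xxxxx
        out += static_cast<char>(0x80 | (byte & 0x3F));  // continuation: 10xxxxxx
    }
    return out;
}
```

For example, 0xF8 (ø in ISO-8859-1) becomes the two bytes 0xC3 0xB8, while 'A' stays a single byte.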