
I'm new to C#. In a project, I'm dealing with serial communication where I'm trying to send/receive a checksum for verifying data transfer integrity. The device I'm sending messages to accepts ascii symbols and characters, for instance an example message might look like this:

String[STX, 1, X, 5, ETX]

To calculate the checksum I need to first convert each symbol to a BitArray. For instance,

STX -> [0,1,0,0,0,0,0,0]

1 -> [1,0,0,0,1,1,0,0]

and so on, and then perform some math operations. Of course, I also need to go the other way, for example:

[0,1,0,0,0,0,0,0] -> STX

I'm fine with doing this on a single character using:

byte[] res = System.Text.ASCIIEncoding.ASCII.GetBytes("X");
BitArray bit_arr = new BitArray(res);

But if I try this on a multi-character ascii symbol such as STX or SYN I get three byte arrays (one for each character). For the time being I'm using a dictionary to go from ascii symbol to unicode which works fine, but since I don't know the checksum ahead of time I'll end up having to put all possible multi-character ascii symbols into that dictionary. Is there a way to get the desired results in C#? To be crystal clear what I want is:

SomeFunc(STX) = [0,1,0,0,0,0,0,0]

Thanks for any help.

user2188329
  • Use [this function](https://stackoverflow.com/a/1120277/102937) to remove the commas and the brackets from the string first. Then present the resulting string to `GetBytes()`. – Robert Harvey Apr 21 '20 at 18:48
  • *"To calculate the checksum I need to first convert each symbol to a BitArray."* -- You seem confused as to what ASCII codes are. The binary value of the "symbol" that is received is identical to the (little-endian) version of the "bit array" that you want to "translate" that "symbol" into. There is no need for *"conversion"*, or a *"dictionary"* and any lookup. The received "symbol" is already encoded with its ASCII value. – sawdust Apr 21 '20 at 19:13
  • I don't have a CS background so it's possible that I'm confused. My understanding is that ASCII codes are arbitrarily assigned to values according to some historic standard such that an ASCII table is simply a lookup table. It sounds like you're saying I could convert an ASCII symbol to its value without said table. Please help me grok. – user2188329 Apr 21 '20 at 21:52

1 Answer


> The device ... accepts ascii symbols and characters ...

ASCII typically refers to a set of codes (i.e. numeric values) for representing characters and symbols.
Printable/displayable characters (such as the upper-case and lower-case letters of the English alphabet, punctuation marks, and decimal digits) and control codes are each assigned a unique value.
The basic set of ASCII codes ranges in value from zero to 127, so each code is representable in 7 bits.

ASCII has been the most widely-used basic code set for representing text in computer systems.
Programming languages typically represent text strings using ASCII or a superset of it; C# strings, for example, are UTF-16, whose first 128 code values are identical to ASCII.

The expression "ASCII character ..." would typically be used to describe or refer to the numeric value for that character (i.e. its ASCII code value), rather than some symbol or glyph.
IOW "ASCII character" is essentially shorthand for the "ASCII code value for the character ...".

The quoted line above should be interpreted to mean that the device uses ASCII codes.


> To calculate the checksum I need to first convert each symbol to a BitArray.

ASCII is not a set of "symbols and characters".
ASCII is a set of numbers. Each number represents (i.e. is mapped to) a character or symbol. Such a set of numbers is called a code set.
ASCII is a code set.

Each ASCII code can be represented in 7 bits, i.e. one bit less than a byte (eight bits).
Since each ASCII code is typically stored in a byte, the units of byte and character are often used interchangeably, especially when referring to storage of text.

Each character in the digital computer is represented by a number, e.g. its ASCII code.
The idea that this character is actually a "symbol" and needs to be converted to some form of numeric representation is incorrect.
A digital computer can only process numbers. If you want to represent a symbol (e.g. for text), then you have to encode that symbol as a number.
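To make this concrete, here is a minimal C# sketch (the class and method names are only illustrative, not from the question). Casting a `char` to `int` does not convert anything; it just exposes the number that was already stored.

```csharp
using System;

public static class CharCodeDemo
{
    // Returns the numeric code of a character.
    // For characters in the ASCII range this is the ASCII code.
    public static int CodeOf(char c) => (int)c;

    public static void Main()
    {
        Console.WriteLine(CodeOf('X'));      // 88
        Console.WriteLine(CodeOf('1'));      // 49
        Console.WriteLine(CodeOf('\u0002')); // 2 (the STX control code)
        Console.WriteLine((char)88);         // X
    }
}
```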


> But if I try this on a multi-character ascii symbol such as STX or SYN ...

A subset of the ASCII codes represents unprintable characters, which are called the control codes.
Most of the functionality of these control codes relates to electro-mechanical teletypewriters, and paper-tape punches and readers. Some of the control codes have been repurposed for CRT-based terminals, aka VDTs.

Since ASCII control-codes are not printable/displayable, there is no need to represent them as a new single-column-wide symbol or glyph.
Hence the ASCII control-codes are represented by a two- or three-letter name, such as STX or CR.

A few control codes are frequently used in text strings, such as line feed, carriage return, and tab. Those control codes are provided printable representations (even though they are actually unprintable) when specifying text strings in some programming languages (such as C escape sequences, e.g. \n, \r, \t).
Other control codes have to be specified by their ASCII code as a numeric escape sequence (e.g. \x02 for STX).
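In C# specifically, a sketch of the example message from the question might look like the following. The class name, the constants `STX` and `ETX`, and the helper `BuildFrame` are names chosen here for illustration. It uses `\u0002` rather than `\x02` because the C# `\x` escape takes one to four hex digits, so `"\x021X5"` would be read as `\x021` (the character `!`) followed by `X5`.

```csharp
using System;
using System.Text;

public static class ControlCodeDemo
{
    public const char STX = '\u0002'; // start of text, ASCII code 2
    public const char ETX = '\u0003'; // end of text, ASCII code 3

    // Builds the example frame STX, '1', 'X', '5', ETX as raw ASCII bytes.
    public static byte[] BuildFrame()
    {
        string message = STX + "1X5" + ETX;
        return Encoding.ASCII.GetBytes(message);
    }

    public static void Main()
    {
        // Prints the five bytes in hexadecimal: 02-31-58-35-03
        Console.WriteLine(BitConverter.ToString(BuildFrame()));
    }
}
```

Each control code occupies exactly one byte in the result, just like the printable characters around it.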

IOW STX or SYN are not a "multi-character ascii symbol".
There is no such thing as a "multi-character ascii symbol".


> For the time being I'm using a dictionary to go from ascii symbol to unicode

> Is there a way to get the desired results in C#?

Each character/symbol is already represented by a number stored in a byte.
There is no need for a "dictionary" or conversion from a "symbol".
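If a `BitArray` is still wanted for the arithmetic, it can be built directly from the byte, and converted back the same way. The following is a sketch only; `ToBits`, `ToByte`, and `Format` are hypothetical helper names. Note that `BitArray` stores the least significant bit at index 0, which is why STX (value 2) shows up as `0,1,0,0,0,0,0,0`.

```csharp
using System;
using System.Collections;

public static class BitArrayDemo
{
    // Byte -> BitArray; index 0 holds the least significant bit.
    public static BitArray ToBits(byte value) => new BitArray(new[] { value });

    // BitArray of 8 bits -> byte.
    public static byte ToByte(BitArray bits)
    {
        var buffer = new byte[1];
        bits.CopyTo(buffer, 0);
        return buffer[0];
    }

    // Formats the bits as "0,1,0,..." in index order.
    public static string Format(BitArray bits)
    {
        var parts = new string[bits.Length];
        for (int i = 0; i < bits.Length; i++)
        {
            parts[i] = bits[i] ? "1" : "0";
        }
        return string.Join(",", parts);
    }

    public static void Main()
    {
        Console.WriteLine(Format(ToBits(0x02)));      // 0,1,0,0,0,0,0,0 (STX)
        Console.WriteLine(Format(ToBits((byte)'1'))); // 1,0,0,0,1,1,0,0
        Console.WriteLine(ToByte(ToBits(0x02)));      // 2
    }
}
```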

You could demonstrate this to yourself by writing a simple program.
Assign a text string to a byte array.
Print out the byte array as a text string.
Print out the byte array, but each byte in decimal representation.
Print out the byte array, but each byte in binary representation.
Print out the byte array, but each byte in hexadecimal representation.
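A minimal sketch of such a program in C# could look like this (the class and helper names are illustrative):

```csharp
using System;
using System.Linq;
using System.Text;

public static class ByteDumpDemo
{
    // Each helper renders the same bytes in a different number base.
    public static string AsDecimal(byte[] data) =>
        string.Join(" ", data.Select(b => b.ToString()));

    public static string AsBinary(byte[] data) =>
        string.Join(" ", data.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));

    public static string AsHex(byte[] data) =>
        string.Join(" ", data.Select(b => b.ToString("X2")));

    public static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("1X5");

        Console.WriteLine(Encoding.ASCII.GetString(data)); // 1X5
        Console.WriteLine(AsDecimal(data));                // 49 88 53
        Console.WriteLine(AsBinary(data));                 // 00110001 01011000 00110101
        Console.WriteLine(AsHex(data));                    // 31 58 35
    }
}
```

All four lines describe the same three bytes; only the way they are displayed changes.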


So the checksum could/should be calculated from the characters/bytes of the message (without any dictionary translation).
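As an illustration only: the question does not say which checksum the device expects, so the sketch below assumes a simple XOR of all bytes. `XorChecksum` is a hypothetical name, and the real algorithm should come from the device's documentation.

```csharp
using System;
using System.Text;

public static class ChecksumDemo
{
    // Hypothetical checksum: XOR of every byte in the message.
    // The actual device may specify a different calculation.
    public static byte XorChecksum(byte[] data)
    {
        byte sum = 0;
        foreach (byte b in data)
        {
            sum ^= b;
        }
        return sum;
    }

    public static void Main()
    {
        // STX, '1', 'X', '5', ETX as raw ASCII bytes.
        byte[] frame = Encoding.ASCII.GetBytes("\u00021X5\u0003");
        Console.WriteLine(XorChecksum(frame).ToString("X2")); // 5D
    }
}
```

The control codes take part in the calculation as ordinary bytes, with no lookup table involved.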

sawdust