The device ... accepts ascii symbols and characters ...
ASCII typically refers to a set of codes (i.e. numeric values) for representing characters and symbols.
Printable/displayable characters (such as the upper- and lower-case letters of the English alphabet, punctuation marks, and decimal digits) and control codes are each assigned a unique value.
The basic set of ASCII codes ranges in value from zero to 127, so each code is representable in 7 bits.
ASCII has been the most widely used basic code set for representing text in computer systems.
Programming languages typically generate text strings using the ASCII code set.
The expression "ASCII character ..." would typically be used to describe or refer to the numeric value for that character (i.e. its ASCII code value), rather than some symbol or glyph.
IOW "ASCII character" is essentially shorthand for the "ASCII code value for the character ...".
The quoted line above should be interpreted to mean that the device uses ASCII codes.
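For example, in C# (since the question asks about C#), casting a character to an integer exposes the code value it is stored as; this is only a small illustrative sketch:

```csharp
using System;

// 'A' is stored as the code value 65; '\n' (line feed) is stored as 10.
// (For these characters the numeric value of a C# char matches the ASCII code.)
Console.WriteLine((int)'A');    // 65
Console.WriteLine((int)'\n');   // 10
```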
To calculate the checksum I need to first convert each symbol to a BitArray.
ASCII is not a set of "symbols and characters".
ASCII is a set of numbers. Each number represents (i.e. is mapped to) a character or symbol. Such a set of numbers is called a code set.
ASCII is a code set.
Each ASCII code can be represented in 7 bits, or less than a byte (i.e. eight bits).
Since each ASCII code is typically stored in a byte, the units of byte and character are often used interchangeably, especially when referring to storage of text.
Each character in the digital computer is represented by a number, e.g. its ASCII code.
The idea that this character is actually a "symbol" and needs to be converted to some form of numeric representation is incorrect.
A digital computer can only process numbers. If you want to represent a symbol (e.g. for text), then you have to encode that symbol as a number.
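As a small C# sketch of that point (the string value here is arbitrary), converting a string to bytes simply exposes the numeric codes the characters are already stored as:

```csharp
using System;
using System.Text;

// Each character of the (arbitrary) string becomes its ASCII code value.
byte[] codes = Encoding.ASCII.GetBytes("HELLO");
Console.WriteLine(string.Join(" ", codes));   // 72 69 76 76 79
```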
But if I try this on a multi-character ascii symbol such as STX or SYN ...
A subset of the ASCII codes represents unprintable characters, which are called control codes.
Most of the functionality of these control codes relates to electromechanical teletypewriters and paper-tape punches and readers. Some of the control codes have been repurposed for CRT-based terminals, aka VDTs.
Since ASCII control codes are not printable/displayable, there is no need to represent them with a single-column-wide symbol or glyph.
Hence the ASCII control codes are represented by two- or three-letter names, such as STX or CR.
A few control codes are frequently used in text strings, such as line feed, carriage return, and tab. Those control codes are given printable representations (even though they are actually unprintable) when specifying text strings in some programming languages (such as C escape sequences, e.g. \n, \r, \t).
Other control codes have to be specified by their ASCII code as a numeric escape sequence (e.g. \x02 for STX).
IOW, STX and SYN are not "multi-character ascii symbols".
There is no such thing as a "multi-character ascii symbol".
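For instance, C# supports similar escape sequences, so STX can be written directly as its single code value (a minimal sketch):

```csharp
using System;

// STX (start of text) is just the code value 2; it has no printable glyph.
char stx = '\u0002';              // equivalently: (char)2
Console.WriteLine((int)stx);      // 2
Console.WriteLine((int)'\t');     // 9 -- tab does have the printable escape \t
```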
For the time being I'm using a dictionary to go from ascii symbol to unicode
Is there a way to get the desired results in C#?
Each character/symbol is already represented by a number stored in a byte.
There is no need for a "dictionary" or conversion from a "symbol".
You could demonstrate this to yourself by writing a simple program; a sketch follows the steps below.
Assign a text string to a byte array.
Print out the byte array as a text string.
Print out the byte array, but each byte in decimal representation.
Print out the byte array, but each byte in binary representation.
Print out the byte array, but each byte in hexadecimal representation.
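A minimal C# sketch of that demonstration (the string "HELLO" is just an arbitrary example):

```csharp
using System;
using System.Text;

byte[] data = Encoding.ASCII.GetBytes("HELLO");

// As a text string.
Console.WriteLine(Encoding.ASCII.GetString(data));    // HELLO

// Each byte in decimal.
Console.WriteLine(string.Join(" ", data));            // 72 69 76 76 79

// Each byte in binary.
Console.WriteLine(string.Join(" ",
    Array.ConvertAll(data, b => Convert.ToString(b, 2).PadLeft(8, '0'))));  // 01001000 ...

// Each byte in hexadecimal.
Console.WriteLine(string.Join(" ",
    Array.ConvertAll(data, b => b.ToString("X2"))));  // 48 45 4C 4C 4F
```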
So the checksum could/should be calculated from the characters/bytes of the message (without any dictionary translation).
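For example, if the device happened to use a simple XOR checksum (only an assumption for illustration; the real algorithm is whatever the device's protocol specifies), it would be computed directly over the message bytes:

```csharp
using System;
using System.Text;

// Hypothetical example message framed with STX/ETX; the framing and the XOR
// algorithm are assumptions, not taken from the device's documentation.
byte[] message = Encoding.ASCII.GetBytes("\u0002HELLO\u0003");

byte checksum = 0;
foreach (byte b in message)
    checksum ^= b;                 // fold each message byte into the running XOR

Console.WriteLine(checksum.ToString("X2"));
```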