My professor has assigned an encryption algorithm for our homework in C++. Instead of outputting in binary, he'd like the encrypted text (plain text that has been run through the cipher) to be output as a string on stdout.

The encryption algorithm will typically produce output values of 128 or greater (which is outside the ASCII range). The terminal usually displays these as symbols like � or square boxes.

When I go to concatenate these symbols to the output (ciphertext), they sometimes disappear depending on neighboring symbols.

Here's an example:

    unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
    unsigned char two = 137; // (same as above)
    std::string con = "";
    con += (one + '\0');
    con += (two + '\0');
    std::cout << con << std::endl;

The output shows only a single symbol: one of the two characters is dropped.

If, however, it was unsigned char one = 244; and unsigned char two = 244;, the output in the console will be ��, so the second char doesn't vanish. I'm not sure why some of these combinations work and others don't. Is there a safer way to concatenate these characters that are outside the normal ASCII range?

I have also tried some things I've found on the site, like:

    // This outputs the wrong text: con += (65 + '0') gives 'q' instead
    // of 'A', but all the symbols are generated with this.
    con += (one + '0');
    con += (two + '0');

I also tried the following, but it has the same results as the first (missing symbols).

    con += one;
    con += two;

Thank you!

  • What a terminal does with any character is terminal-specific. – user4581301 Nov 19 '21 at 19:38
  • Results may differ in different terminals. You probably should not pay attention to what `std::cout` is outputting. I suggest trying to put the results in a text file. – digito_evo Nov 19 '21 at 19:43
  • Also, what exactly is the purpose of appending '\0' characters to `con`? – digito_evo Nov 19 '21 at 19:49
  • @digito_evo, I'm really not sure. I thought I'd try it because I saw others appending '0', which gave me the wrong output, but the null character appended worked. – calebthebruiser Nov 19 '21 at 19:51
  • @digito_evo Interestingly enough, I routed the string to an 'output.txt' file and the information is there (albeit, in hexadecimal 89 and F4). So, the information isn't lost in the string, the console is just having a hard time with it. Similarly, the file gave a corruption warning because it couldn't display the characters other than in hex. – calebthebruiser Nov 19 '21 at 19:53
  • I am writing an answer. – digito_evo Nov 19 '21 at 19:56
  • `it has the same results as the first (missing symbols).` But... what is the point of this? What would you want to _show_ when printing a byte with 244? Store 8-bit values in `std::vector`. ` the output in the console will be ��, so the second char doesn't vanish` Can your whole question be ignored, and you are only asking why `std::cout << (unsigned char)137 << '\n'` does not show anything? What terminal are you using (what is the program you are __viewing__ the result in?)? Windows/linux? – KamilCuk Nov 19 '21 at 19:59
  • What **exactly** do you want to be shown on the terminal when your output is the character 244? If you don't have a precise answer, I would recommend asking your professor because I suspect you might have misunderstood the requirement to "output as a string". – n. m. could be an AI Nov 19 '21 at 20:07
  • For reference, if I were your professor, my answer would be like this. "The program inputs and outputs arbitrary bytes, not necessarily ASCII text. It is not intended to show anything in particular on the terminal, because arbitrary bytes do not show nicely on the terminal. Don't even bother running it when its standard output is a terminal, or if you accidentally do so anyway, don't bother looking at the output. Redirect to a file, and analyze the resulting file with appropriate tools". But he might have something else in mind. – n. m. could be an AI Nov 19 '21 at 20:19
  • *continuing* For example, he might want you to encode/escape non-ASCII characters such that 244 appears as the sequence of 4 characters `\ `, `x`, `f` and `4`, to read `\xf4`, or something similar. Likewise you should escape all non-printable characters, and also the backslash character. Or it might be something still different, I cannot possibly know. – n. m. could be an AI Nov 19 '21 at 20:24
  • @n.1.8e9-where's-my-sharem. That would be the logical thing for him to expect, but he wants it as output in the console. I asked him earlier if it would be best to just encipher it as 16-bit blocks of binary, but he wanted to see some text on the console. I emailed him about the discrepancy across terminals; guess I'll wait for a response. – calebthebruiser Nov 19 '21 at 20:25
  • "ABCDEFGH" is *some* text. Would it be acceptable to always print it, no matter what the input is? My semi-educated guess is "probably not". So what does "some text on the console" really mean? I don't think it is possible to proceed without having an exact answer. At any rate, the "unknown/unprintable character" (color-reversed question mark or a block or whatever) is hardly "text", one cannot read it, so I would not consider it as a possible candidate to fulfill the role of "some text". – n. m. could be an AI Nov 19 '21 at 20:31

2 Answers

Nothing is lost; all your characters are there:

    #include <cstdio>   // std::printf
    #include <string>   // std::string

    int main()
    {
        unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
        unsigned char two = 137; // (same as above)
        std::string con = "";
        con += (one + '\0');
        con += (two + '\0');
        // Use con.size() rather than strlen(): strlen() would stop at the
        // first 0 byte, and the cipher may legitimately produce one.
        for (std::size_t i = 0; i < con.size(); i++)
        {
            std::printf("char %zu = %d\n", i, static_cast<unsigned char>(con.c_str()[i]));
        }
        return 0;
    }

And the result:

    $ g++ -g -O0 main.cpp -o main
    $ ./main
    char 0 = 244
    char 1 = 137

It is sometimes easier to inspect strings as raw, C-style strings and print their contents in a format that suits you best.
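One such format, suggested in the comments on the question, is to escape every byte that is not printable ASCII, so that 244 shows up as the four characters \xf4. The following is only a minimal sketch of that idea: the helper name escape_bytes is hypothetical, and the exact escaping rules (hex escapes, doubled backslash) are an assumption rather than anything the assignment specifies.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the question): returns an ASCII-only
// rendering of `bytes`. Printable ASCII is copied as-is, a backslash is
// doubled, and every other byte becomes \xNN with two hex digits.
std::string escape_bytes(const std::string& bytes)
{
    std::string out;
    for (unsigned char b : bytes)   // read each char as a value in 0..255
    {
        if (b == '\\')
        {
            out += "\\\\";
        }
        else if (b >= 0x20 && b <= 0x7e)
        {
            out += static_cast<char>(b);
        }
        else
        {
            char buf[5];  // backslash, 'x', two hex digits, terminating '\0'
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(b));
            out += buf;
        }
    }
    return out;
}
```

With the two bytes from the question, std::cout << escape_bytes(con) << '\n'; would print \xf4\x89, which looks the same in every terminal.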

OrenIshShalom

First of all, you should know that '\0' and '0' are two distinct characters having ASCII codes 0 and 48 respectively.

  • The statement con += (one + '\0'); is equivalent to con += (244 + 0);, so it appends the byte 244 unchanged.
  • But the statement con += (one + '0'); is equivalent to con += (244 + 48);. The sum is computed as an int and equals 292, but a single character of the string holds only 8 bits (maximum 255 when read as unsigned char). So the value wraps around when it is stored and you end up with 36 (292 - 256), which is the '$' character. For con += (two + '0'); the sum is 137 + 48 == 185, which does not wrap but is still not the original byte.
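The arithmetic in the two statements above can be checked with a small sketch. The helper name stored_byte is hypothetical, and the sketch assumes an 8-bit char, as on mainstream platforms.

```cpp
#include <string>

// Hypothetical helper: appends `value + offset` to a string the same way
// `con += (one + '0')` does, then reports which byte was actually stored.
int stored_byte(unsigned char value, char offset)
{
    std::string s;
    int sum = value + offset;       // both operands are promoted to int
    s += static_cast<char>(sum);    // narrowing to char keeps the low 8 bits
    return static_cast<unsigned char>(s[0]);
}
```

Under that assumption, stored_byte(244, '\0') gives 244, stored_byte(244, '0') gives 36, stored_byte(137, '0') gives 185, and stored_byte(65, '0') gives 113, which is the 'q' mentioned in the question.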

I would suggest writing something like the code below, which is the C++ way of doing it:


    #include <iostream>
    #include <string>

    int main()
    {
        unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
        unsigned char two = 137; // (same as above)

        std::string con = "";
        con += one;
        con += '\0';
        con += two;
        con += '\0';

        std::cout << "con: <" << con << ">\n" << '\n';

        for ( std::size_t idx = 0; idx < con.length( ); ++idx )
        {
            // Notice the unary + operator in front of static_cast:
            // it makes the byte print as a number, not as a character.
            std::cout << "index " << idx << ": <"
                      << +static_cast<unsigned char>( con[ idx ] ) << ">" << '\n';
        }
    }

In the Windows command prompt, this gives:

    con: <⌠ ë >

    index 0: <244>
    index 1: <0>
    index 2: <137>
    index 3: <0>

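As you can see, in this terminal each '\0' is displayed like a space character between the actual data characters. It is still a real byte stored in the string (indices 1 and 3), and other terminals may display it differently.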

Also, notice how the unary + operator causes a variable of type char, signed char, or unsigned char to be printed as an integer. Read more about it here: How to output a character as an integer through cout?
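As a small illustration of that difference, the sketch below formats the same byte with and without the unary +. The helper names as_number and as_character are hypothetical, and a std::ostringstream stands in for std::cout so the result can be inspected as a string.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte as its decimal value.
std::string as_number(unsigned char byte)
{
    std::ostringstream out;
    out << +byte;   // unary + promotes the byte to int, so digits are written
    return out.str();
}

// Hypothetical helper: formats one byte as a character.
std::string as_character(unsigned char byte)
{
    std::ostringstream out;
    out << byte;    // without +, the stream writes the raw byte itself
    return out.str();
}
```

Here as_number(244) yields the three characters 244, while as_character(65) yields the single character A.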

digito_evo