
I have this code:

#include <iostream>
#include <string>
using namespace std;

void test(string in){
    for(short i=0; i<in.length(); i++)
        cout<<hex<<(unsigned short)in[i];
    cout<<"\n";
    string msg=in;
    for(short i=0; i<msg.length(); i++)
        cout<<hex<<(unsigned short)msg[i];
    cout<<"\n";
    msg+=(char)128;
    for(short i=0; i<msg.length(); i++)
        cout<<hex<<(unsigned short)msg[i];
}
int main(){
    test("123456");
}

I expect the output to be:

313233343536
313233343536
31323334353680

But instead, it is the following:

313233343536
313233343536
313233343536ff80

It seems that the += operator does something I didn't account for. I use Code::Blocks on a 64-bit machine. How can I fix it?

  • I'm not sure why you think `+=` is the issue here. I assume you didn't choose `128` by accident. What happens if you change that value? What happens if you just assign `(char)128` to an int? – cigien May 26 '23 at 09:23
  • Sign extension. `unsigned short` is 2 bytes on your platform and leading zeros are not printed. `char` is probably signed on your platform, so `(char)128` holds -128; converting that to `unsigned short` sign-extends it to `ff80`, and because the leading digits are not `0` they get printed. – Richard Critten May 26 '23 at 09:31
  • [Is char signed or unsigned by default?](https://stackoverflow.com/questions/2054939/is-char-signed-or-unsigned-by-default) – Jason May 26 '23 at 09:39
  • @cigien I chose 128 because I need a single '1' bit followed by '0' bits. – Ágoston Kis May 28 '23 at 08:45

1 Answer


Your compiler treats char as a signed type, i.e. its range of values is [-128, 127].

msg+=(char)128;

appends a char with the value -128, whose binary representation is 0b1000'0000.

When you read this char in (unsigned short)msg[i], the value -128 is converted to unsigned short. The conversion is done modulo 2^16 (assuming a 16-bit unsigned short), which for a two's complement value has the same effect as sign extension: the upper bits are filled with 1 bits until the width of the target type is reached, leaving you with 0b1111'1111'1000'0000 = 0xff80.

0b1111'1111'1000'0000
  ^^^^ ^^^^           bits from sign extension
            ^^^^ ^^^^ 128

To fix this you can cast to unsigned char first:

for (short i = 0; i < msg.length(); i++)
    cout << hex << static_cast<unsigned short>(static_cast<unsigned char>(msg[i]));
fabian