I'm completely new to C++, so I guess this might be a very trivial question. If this is a duplicate of an already answered question (I bet it is...), please point me to that answer!

I have a file whose first four bytes look like this in the output of hexdump myfile -n 4:

00000000  02 00 04 00 ...    |....|
00000004

My problem/confusion comes when trying to read these values and convert them to ints ( [0200]_hex --> [512]_dec and [0400]_hex --> [1024]_dec).

A minimum working example based on this answer:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(void){
    char fn[] = "myfile";
    ifstream file;
    file.open(fn, ios::in | ios::binary);

    string fbuff = "  ";
    file.read((char *)&fbuff[0], 2);
    cout << "fbuff: " << fbuff << endl;

    // works
    string a = "0x0200";
    cout << "a: " << a  << endl;
    cout << "stoi(a): " << stoi(a, nullptr, 16) << endl;

    // doesn't work
    string b = "\x02\x00";
    cout << "b: " << b << endl;
    cout << "stoi(b): " << stoi(b, nullptr, 16) << endl;

    // doesn't work
    cout << "stoi(fbuff): " << stoi(fbuff, nullptr, 16) << endl;

    file.close();
    return(0);
}

What I can't get my head around is the difference between a and b: the former is defined with 0x (which works perfectly), while the latter is defined with \x and breaks stoi. My guess is that what's being read from the file is in the \x format, based on the output when running the code within sublime-text3 (below), and every example I've seen only deals with inputs formatted like 0x0200.

// Output from sublime, which just runs g++ file.cpp && ./file.cpp
fbuff: <0x02> <0x00>
a: 0x0200
stoi(a): 512
b: 
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
[Finished in 0.8s with exit code -6]

Is there a simple way to read two, or more, bytes, group them and convert into a proper short/int/long?

  • note that `02 00 04 00` is not `0x200` and `0x0400` (if interpreted as 16bit), but `0x0002` and `0x0004` and (if 32bit) not `0x02000400` but `0x00040002` – vlad_tepesch Nov 17 '18 at 23:05
  • Thanks, so you mean it automatically takes care of endianness? I also noted there was a typo in my definition of `b`. It should have been `string b = "\x02\x00"`. – user10668188 Nov 18 '18 at 08:45

1 Answer

The literal string "0x0200" is really an array of seven bytes:

0x30 0x78 0x30 0x32 0x30 0x30 0x00

The first six are ASCII encoded characters for '0', 'x', '0', '2', '0' and '0'. The last is the null-terminator that all strings have.

The literal string "\x00\x02" is really an array of three bytes:

0x00 0x02 0x00

That is not really what is normally called a "string", but rather just a collection of bytes. std::stoi expects text made up of digit characters, so it can't parse these bytes, and when it can't parse its input it throws an exception (the std::invalid_argument shown in your output).

You might want to get a couple of good books to read and learn more about strings.


Note: This answer assumes ASCII encoding and 8-bit bytes, which is by far the most common.

Some programmer dude
  • Great! Thank you for clearing that up and for the book list. Two follow-up questions to help my understanding: 1) so what is really read from the file will be in the second form, i.e. just the two bytes `\x02` and `\x00` without a null-terminator (but that already sits "outside" of my 2-byte string anyway)? 2) Do I need to write my own function to translate such a hex-string to int, or is there a built-in function for that, similar to std::stoi? A bonus would be if it can handle endianness as well, similar to Python's `struct.unpack` functions. – user10668188 Nov 18 '18 at 08:23
  • And one more question: When I run `a.length()`, I get 6 as you described. However when running `b.length()` I get 0, but I would expect 2...? – user10668188 Nov 18 '18 at 08:33
  • @user10668188 The last question is easy to answer: the null-terminator is the character `'\0'`, which is the octal equivalent of `'\x00'`. That is, the string you use to initialize `b` is terminated at the first character, i.e. it's a string of length `0`. – Some programmer dude Nov 18 '18 at 08:38