I have a vector<unsigned char> filled with binary data. I need to take, let's say, 2 items from the vector (2 bytes) and convert them to an integer. How could this be done in a way that is not C-style?

-
"C-style" is the best way to do this since you're going to reinterpret data under a different type, why do you fear it? – CharlesB Oct 27 '10 at 09:04
-
@CharlesB: When in Rome, do as the Romans do. This is C++, using the C++ cast operators is wise. – ereOn Oct 27 '10 at 09:13
6 Answers
Please use the shift operator / bit-wise operations.
int t = (v[0] << 8) | v[1];
All the solutions proposed here that are based on casting/unions are AFAIK undefined behavior, and may fail on compilers that take advantage of strict aliasing (e.g. GCC).
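As a minimal sketch of the shift-based approach, the helpers below wrap the expression for both byte orders. The names read_be16 and read_le16 are hypothetical (not from the answer), and the assert-based bounds check is an added assumption about how a caller might want to guard the access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reads two bytes starting at pos,
// treating the first byte as the high byte (big-endian data).
inline std::uint16_t read_be16(const std::vector<unsigned char>& v, std::size_t pos) {
    assert(pos + 2 <= v.size());  // caller must supply enough bytes
    return static_cast<std::uint16_t>((v[pos] << 8) | v[pos + 1]);
}

// Same idea for data stored low byte first (little-endian data).
inline std::uint16_t read_le16(const std::vector<unsigned char>& v, std::size_t pos) {
    assert(pos + 2 <= v.size());
    return static_cast<std::uint16_t>(v[pos] | (v[pos + 1] << 8));
}
```

Because the result depends only on the order of the bytes in the vector, it is the same on little-endian and big-endian hosts.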

-
Correct me if I'm wrong, but since `v[0]` is an unsigned char, won't the `<< 8` generate a warning if we don't cast it to `int` first? – ereOn Oct 28 '10 at 07:19
-
Types smaller than `int` get promoted to int when used with arithmetic or bitwise operators, so the shift works fine without having to cast v[0] explicitly. – Daniel Oct 28 '10 at 09:53
-
Can you explain how this works? What is the significance of the 8 in the shift operation? How does this expression change if I were to use it for a long long type instead of an int type? – The Mitra Boy Jan 19 '14 at 18:09
-
@TheMitraBoy -- it shifts v[0] left by 8 bits (one byte) into the high-order position and then ORs in v[1]. So if you had 0x12, 0x34 you would get 0x1234. In case you were still wondering XD – thc Sep 29 '17 at 21:26
-
Can someone please explain what can go wrong when using reinterpret_cast with strict aliasing? Thanks – psclkhoury Jul 28 '22 at 17:03
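On the question in the comments about wider types such as long long: one possible sketch is to accumulate the bytes in a 64-bit variable, so that every shift happens on a 64-bit value rather than on a promoted int (shifting an int by 32 bits or more would be undefined). The name read_be and its parameters are hypothetical, and the data is assumed to be stored most significant byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: combines `count` bytes (1 to 8) starting at pos,
// most significant byte first, into one unsigned 64-bit value.
inline std::uint64_t read_be(const std::vector<unsigned char>& v,
                             std::size_t pos, std::size_t count) {
    assert(count >= 1 && count <= 8);
    assert(pos + count <= v.size());
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Move the bytes collected so far up by one byte,
        // then place the next byte in the low 8 bits.
        result = (result << 8) | v[pos + i];
    }
    return result;
}
```

With count equal to 2 this reduces to the same value as (v[0] << 8) | v[1].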
You may do:
vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t
int i = *reinterpret_cast<const uint16_t*>(&somevector[0]);
// This reads the two bytes in the host's byte order, so you must be sure of the byte order
// or
int i2 = (static_cast<int>(somevector[0]) << 8) | somevector[1];
// This treats somevector[0] as the high byte (big-endian), so you must be sure of the byte order as well

-
On some platforms, the first version will generate an exception if you try to do it at an odd offset. On many (including most modern Intels, I think), an odd offset will involve a performance penalty. If the performance is crucial, it will be up to the programmer to determine the faster way. – Eugene Smith Oct 27 '10 at 09:21
-
To be honest, a reinterpret_cast is more "C style" than the bitwise operations; it's just hidden by a C++ template! ;) What's that expression about a turd? ;) I would always go with the bitwise operations; it's frankly clearer! – Nim Oct 27 '10 at 09:46
v[0]*0x100+v[1]

-
Won't work. `v[0]` is an unsigned char. You must first cast it to an integer so that the `* 0x100` doesn't overflow. – ereOn Oct 27 '10 at 09:08
-
v[0] will be implicitly promoted to signed int. C standard section 3.2.1.1. – Eugene Smith Oct 27 '10 at 09:15
-
Is this necessarily any more C++ than C? And anyway, it's a suboptimal version of the bitwise operations (i.e. the above will *most likely* be optimized by the compiler to the bitwise operations)! If the OP is after a class (akin to ByteBuffer in Java nio), that's a different issue... – Nim Oct 27 '10 at 09:51
-
@user434507, it's not about points, I'm just trying to highlight to you that you shouldn't ignore part of the language just because it's perceived to be "C style" whatever that is... If you are explicitly after a class to serialize/deserialize data properly from a binary stream, look at boost's serialization library - it's very powerful. – Nim Oct 27 '10 at 10:03
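The promotion point debated in the comments above can be checked at compile time. The sketch below assumes a mainstream platform where int is wider than unsigned char and at least 32 bits wide, so that the largest result (65535) fits; the function name combine is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper mirroring the expression from the answer.
inline int combine(const std::vector<unsigned char>& v) {
    assert(v.size() >= 2);  // caller must supply at least two bytes
    // v[0] is promoted to int before the multiplication,
    // so the product has type int rather than unsigned char.
    static_assert(std::is_same<decltype(v[0] * 0x100), int>::value,
                  "unsigned char operands are promoted to int");
    return v[0] * 0x100 + v[1];
}
```

If the static_assert compiles, the multiplication is carried out in int and cannot wrap around at 255.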
Well, one other way to do it is to wrap a call to memcpy:
#include <cstdint>
#include <cstring>
#include <vector>
using namespace std;
// Copies sizeof(T) bytes starting at v[pos] into a value of type T.
// The bytes are interpreted in the host's byte order.
template <typename T>
T extract(const vector<unsigned char> &v, int pos)
{
    T value;
    memcpy(&value, &v[pos], sizeof(T));
    return value;
}
int main()
{
    vector<unsigned char> v;
    //Simulate that we have read a binary file.
    //Add some binary data to v.
    v.push_back(2);
    v.push_back(1);
    //00000001 00000010 == 258 (on a little-endian host)
    int a = extract<int16_t>(v,0); //a==258
    int b = extract<short>(v,0); //b==258
    //add 2 more to simulate extraction of a 4 byte int.
    v.push_back(0);
    v.push_back(0);
    int c = extract<int>(v,0); //c == 258
    //Get the last two elements.
    int d = extract<short>(v,2); // d==0
    return 0;
}
The extract function template also works with double, long int, float and so on.
There are no size checks in this example. We assume v actually has enough elements before each call to extract.
Good luck!
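Since the example above has no size checks, here is one possible bounds-checked variant. This is a sketch only: extract_checked is a hypothetical name, and throwing std::out_of_range is an assumed policy rather than part of the original answer.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical bounds-checked variant of extract.
template <typename T>
T extract_checked(const std::vector<unsigned char>& v, std::size_t pos) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable type");
    // Written as a subtraction so that pos + sizeof(T) cannot overflow.
    if (pos > v.size() || v.size() - pos < sizeof(T)) {
        throw std::out_of_range("extract_checked: not enough bytes");
    }
    T value;
    std::memcpy(&value, v.data() + pos, sizeof(T));  // bytes in host byte order
    return value;
}
```

As with the original extract, the value depends on the host's byte order, so this suits data that was written on the same kind of machine.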

What do you mean by "not in C style"? Using bitwise operations (shifts and ORs) to get this to work does not make it "C style"!
What's wrong with: int t = v[0]; t = (t << 8) | v[1]; ?

If you don't want to care about whether the host is big- or little-endian, and the data is stored in network (big-endian) byte order, you can use ntohs:
vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t
int i = ntohs(*reinterpret_cast<const uint16_t*>(&somevector[0]));
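A variant of the same idea that copies the bytes with memcpy instead of dereferencing a cast pointer, which sidesteps the alignment and strict-aliasing concerns raised elsewhere in this thread. It is a sketch assuming a POSIX system (ntohs comes from <arpa/inet.h>; on Windows it is declared in <winsock2.h>), and read_net16 is a hypothetical name.

```cpp
#include <arpa/inet.h>  // ntohs (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reads a 16-bit value stored in network (big-endian) order.
inline std::uint16_t read_net16(const std::vector<unsigned char>& v, std::size_t pos) {
    assert(pos + 2 <= v.size());  // caller must supply enough bytes
    std::uint16_t raw;
    std::memcpy(&raw, &v[pos], sizeof raw);  // copy the bytes as they are stored
    return ntohs(raw);                       // network order -> host order
}
```

For the bytes 0x12, 0x34 this yields 0x1234 regardless of the host's byte order.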
