How to convert vector to int?

Question

I have a vector<unsigned char> filled with binary data. I need to take, let's say, 2 items from the vector (2 bytes) and convert them to an integer. How can this be done in a way that is not C style?

"C-style" is the best way to do this since you're going to reinterpret data under a different type, why do you fear it? — CharlesB, Oct 27 '10 at 09:04
@CharlesB: When in Rome, do as the Romans do. This is C++, using the C++ cast operators is wise. — ereOn, Oct 27 '10 at 09:13

Daniel · Accepted Answer · 2010-10-27T10:29:16.470

8

Please use the shift operator / bit-wise operations.

int t = (v[0] << 8) | v[1];

All the solutions proposed here that are based on casting/unions are AFAIK undefined behavior, and may fail on compilers that take advantage of strict aliasing (e.g. GCC).
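As a minimal sketch of the shift-based approach, the two byte orders could be wrapped in small helpers. The names read_be16 and read_le16 are hypothetical (they do not come from the answer), and the bounds check is an addition on top of the original one-liner:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads v[pos] and v[pos + 1] as a big-endian
// (most significant byte first) 16-bit value.
inline std::uint16_t read_be16(const std::vector<unsigned char>& v, std::size_t pos)
{
    if (v.size() < 2 || pos > v.size() - 2)
        throw std::out_of_range("read_be16: not enough bytes");
    // Both operands are promoted to int before the shift and the OR.
    return static_cast<std::uint16_t>((v[pos] << 8) | v[pos + 1]);
}

// Hypothetical helper: same idea, least significant byte first.
inline std::uint16_t read_le16(const std::vector<unsigned char>& v, std::size_t pos)
{
    if (v.size() < 2 || pos > v.size() - 2)
        throw std::out_of_range("read_le16: not enough bytes");
    return static_cast<std::uint16_t>(v[pos] | (v[pos + 1] << 8));
}
```

Because the result is built from the byte values rather than from their memory layout, it does not depend on the host's endianness or on alignment.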

edited Oct 27 '10 at 10:29

answered Oct 27 '10 at 10:13

Daniel


Correct me if I'm wrong, but since `v[0]` is an unsigned char, won't the `<< 8` generate a warning if we don't cast it to `int` first? – ereOn Oct 28 '10 at 07:19
Types smaller than `int` get promoted to int when used with arithmetic or bitwise operators, so the shift works fine without having to cast v[0] explicitly. – Daniel Oct 28 '10 at 09:53
Can you explain how this works? What is the significance of the 8 in the shift operation? How does this expression change if I were to use it for a long long type instead of an int type? – The Mitra Boy Jan 19 '14 at 18:09
@TheMitraBoy -- it shifts v[0] left by 8 bits (one byte) into the high-order position and then ORs in v[1]. So if you had 0x12, 0x34 you would get 0x1234. In case you were still wondering XD – thc Sep 29 '17 at 21:26
Can someone please explain what can go wrong when using reinterpret_cast with strict aliasing? Thanks – psclkhoury Jul 28 '22 at 17:03
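On the long long question raised in the comments: one way to generalize the expression is to accumulate into a 64-bit unsigned value, shifting by 8 bits per byte. This is only a sketch; read_be is a hypothetical name, and it assumes the caller has already checked that count is at most 8 and that enough bytes are available. Accumulating in a std::uint64_t matters because a promoted int must not be shifted by 32 bits or more.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: folds `count` bytes starting at `pos` into one value,
// most significant byte first. The caller must ensure that count <= 8 and
// that pos + count does not run past the end of v.
inline std::uint64_t read_be(const std::vector<unsigned char>& v,
                             std::size_t pos, std::size_t count)
{
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Make room for the next byte, then place it in the low 8 bits.
        result = (result << 8) | v[pos + i];
    }
    return result;
}
```

With count == 2 this computes the same value as (v[0] << 8) | v[1]; with count == 8 it fills a full 64-bit integer.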

ereOn · Answer 2 · 2010-10-27T09:16:11.060

7

You may do:

vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t

int i = *reinterpret_cast<const uint16_t*>(&somevector[0]);
// This reads the two bytes in the host's byte order, so you must be sure of the byte order

// or
int i2 = (static_cast<int>(somevector[0]) << 8) | somevector[1];
// This treats somevector[0] as the high byte (big-endian), so you must be sure of the byte order as well

edited Oct 27 '10 at 09:16

answered Oct 27 '10 at 09:06

ereOn


You can use ntohs() to make it independent of big/little endian – Benoit Thiery Oct 27 '10 at 09:16
On some platforms, the first version will generate an exception if you try to do it at an odd offset. On many (including most modern Intels, I think), an odd offset will involve a performance penalty. If the performance is crucial, it will be up to the programmer to determine the faster way. – Eugene Smith Oct 27 '10 at 09:21
To be honest, a reinterpret_cast is more "C style" than the bitwise operations, it's just hidden by a C++ template! ;) What's that expression about a turd? ;) I would always go with the bitwise operations, it's frankly clearer! – Nim Oct 27 '10 at 09:46

score 5 · Answer 3 · answered Oct 27 '10 at 09:04

v[0]*0x100+v[1]

answered Oct 27 '10 at 09:04

Eugene Smith


Or the other way around if we're talking little endian. – Benjamin Lindley Oct 27 '10 at 09:08
Won't work. `v[0]` is an unsigned char. You must first cast it to an integer so that the `* 0x100` doesn't overflow. – ereOn Oct 27 '10 at 09:08
v[0] will be implicitly promoted to signed int. C standard section 3.2.1.1. – Eugene Smith Oct 27 '10 at 09:15
is this necessarily any more C++ than C? And anyway, it's a suboptimal version of the bitwise operations (i.e. the above will *most likely* be optimized by the compiler to the bitwise operations)! If the OP is after a class (akin to ByteBuffer in Java nio), that's a different issue... – Nim Oct 27 '10 at 09:51
@user434507, it's not about points, I'm just trying to highlight to you that you shouldn't ignore part of the language just because it's perceived to be "C style" whatever that is... If you are explicitly after a class to serialize/deserialize data properly from a binary stream, look at boost's serialization library - it's very powerful. – Nim Oct 27 '10 at 10:03
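The promotion point debated in the comments above can be checked at compile time. The sketch below assumes a mainstream platform where int is wider than unsigned char (on an exotic platform where they have the same width, the promoted type would be unsigned int instead); combine is a hypothetical name used only for illustration:

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper mirroring the expression from the answer.
inline int combine(const std::vector<unsigned char>& v)
{
    // Integral promotion converts the unsigned char operand to int before
    // the arithmetic happens, so neither expression is evaluated in 8 bits.
    static_assert(std::is_same<decltype(v[0] * 0x100), int>::value,
                  "unsigned char operand is promoted to int");
    static_assert(std::is_same<decltype(v[0] << 8), int>::value,
                  "unsigned char operand is promoted to int");
    return v[0] * 0x100 + v[1];
}
```

Since the multiplication is carried out in int, the largest possible input (0xFF, 0xFF) yields 0xFFFF rather than wrapping around.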

mantler · Answer 4 · 2011-11-09T20:19:47.470

Well, one other way to do it is to wrap a call to memcpy:

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>
using namespace std;

//Copies sizeof(T) bytes starting at v[pos] into a T.
//The bytes are interpreted in the host's byte order.
template <typename T>
T extract(const vector<unsigned char> &v, size_t pos)
{
  T value;
  memcpy(&value, &v[pos], sizeof(T));
  return value;
}

int main()
{
  vector<unsigned char> v;
  //Simulate that we have read a binary file.
  //Add some binary data to v.
  v.push_back(2);
  v.push_back(1);
  //On a little-endian host: 00000001 00000010 == 258

  int a = extract<int16_t>(v,0); //a==258
  int b = extract<short>(v,0); //b==258 (assuming a 2-byte short)

  //add 2 more to simulate extraction of a 4 byte int.
  v.push_back(0);
  v.push_back(0);
  int c = extract<int>(v,0); //c == 258 (assuming a 4-byte int)

  //Get the last two elements.
  int d = extract<short>(v,2); // d==0

  return 0;
}

The extract function template also works with double, long int, float and so on.

There are no size checks in this example. We assume v actually has enough elements before each call to extract.
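If the missing size check is a concern, a checked variant could look roughly like the sketch below. The name extract_checked, the exception type, and the trivially-copyable restriction are assumptions added here, not part of the answer:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical checked variant of extract: throws instead of reading
// past the end of the vector.
template <typename T>
T extract_checked(const std::vector<unsigned char>& v, std::size_t pos)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    // Written this way so that pos + sizeof(T) cannot overflow.
    if (pos > v.size() || v.size() - pos < sizeof(T))
        throw std::out_of_range("extract_checked: not enough bytes");
    T value;
    std::memcpy(&value, v.data() + pos, sizeof(T));
    return value;
}
```

As with the original, the bytes are interpreted in the host's byte order, so the result for multi-byte types differs between little-endian and big-endian machines.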

Good luck!

score 3 · Answer 5 · answered Oct 27 '10 at 09:08

What do you mean by "not in C style"? Using bitwise operations (shifts and ORs) to get this to work does not make it "C style"!

What's wrong with: int t = v[0]; t = (t << 8) | v[1]; ?

answered Oct 27 '10 at 09:08

Nim


score 1 · Answer 6 · answered Oct 27 '10 at 09:21

If you don't want to care about big/little endian, you can use:

vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t

int i = ntohs(*reinterpret_cast<const uint16_t*>(&somevector[0]));
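The same ntohs idea can be combined with memcpy so that the read does not depend on alignment or on casting the byte pointer to another type. This is a sketch under stated assumptions: it targets a POSIX system, where ntohs is declared in <arpa/inet.h> (on Windows it comes from <winsock2.h>), read_net16 is a hypothetical name, and the caller must ensure that two bytes are available at pos:

```cpp
#include <arpa/inet.h>  // ntohs (POSIX header; assumption about the platform)
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reads two bytes stored in network (big-endian) order
// and returns them as a host-order value.
inline std::uint16_t read_net16(const std::vector<unsigned char>& v, std::size_t pos)
{
    std::uint16_t raw;
    // Copy the bytes instead of dereferencing a cast pointer.
    std::memcpy(&raw, v.data() + pos, sizeof raw);
    // Convert from network byte order to the host's byte order.
    return ntohs(raw);
}
```

For the bytes 0x12, 0x34 this yields 0x1234 on both little-endian and big-endian hosts, the same value as the shift-based expression.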

answered Oct 27 '10 at 09:21

Benoit Thiery

