16

I've seen people use size_t whenever they mean an unsigned integer. For example:

class Company {
  size_t num_employees_;
  // ...
};

Is that good practice? One thing is you have to include <cstddef>. Should it be unsigned int instead? Or even just int?

Just using int sounds attractive to me since it avoids stupid bugs in loops like this one, which people often write with an int counter; with an unsigned counter, i >= 0 is always true and the loop never ends:

for(int i = num_employees_ - 1; i >= 0; --i) {
   // do something with employee_[i]
}
jalf
Frank
  • 1
    I don't know anyone that counts things from size downwards to 0. :| – GManNickG Jul 11 '09 at 16:31
  • 1
    I'm curious about the answer to this with respect to casting. If you go with unsigned int or int and that is not what size_t actually is on your system then presumably it has to convert it so maybe there's a performance hit? – Troubadour Jul 11 '09 at 16:33
  • It can be useful, but doing it the way the example shows is just bad practice. Choosing your data type in order to satisfy a bad practice results in 2 bad practices. – Gerald Jul 11 '09 at 16:35
  • Also you should get a compiler warning about a signed/unsigned mismatch in this example. – Gerald Jul 11 '09 at 16:41
  • 1
    @GMan: Well, if there were a *function* that returns the number of items then counting backwards would be useful: for(int i=getNum()-1; i>=0; --i) because if you just counted upwards then the function would be called many times (potential performance issue). – Frank Jul 11 '09 at 16:42
  • You probably shouldn't call a function in your for either way, call it before the loop. – Gerald Jul 11 '09 at 16:45
  • 1
    In this economic climate, you frequently count employees down towards zero. Ever heard of "last in, first out"? – Steve Jessop Jul 11 '09 at 16:48
  • @dehmann: Any decent compiler will optimize it out of the for loop or inline it anyway, as long as you're const-correct. @onebyone: A stack? Then use an std::stack. That's different than what I'm talking about, which is looping through an array backwards. – GManNickG Jul 11 '09 at 17:03
  • You might want to take a look at http://stackoverflow.com/questions/994288/ as well. – D.Shawley Jul 11 '09 at 17:24
  • @GMan: Not really, no. Why store employees in a stack just because once per economic cycle you want to perform a LIFO operation? What if you also want to perform forward ops at other times -- stack has no iterator. I suppose you could copy them into a stack and then pop them off and simultaneously remove them from the "real" container, but it seems a bit unadventurous to assume oneself incapable of writing a backward loop. – Steve Jessop Jul 11 '09 at 17:37
  • And btw, const correctness almost never allows a compiler to move a function call outside a loop. A function call on a const reference is not guaranteed always to return the same value. The compiler can't assume that it does, because it can't in general assume that the loop doesn't somehow modify the const object. It also may not know whether the function has side-effects. If *everything* in the loop can be inlined (so the compiler knows no modification via globals), and there's no possibility of pointer aliasing, then maybe. – Steve Jessop Jul 11 '09 at 17:44
  • I was considering either a stack or array, not usage of both, my mistake. Either way, counting down is unorthodox, especially for "optimizations". This is why std::vector and boost::array exist. Standard interfaces to dynamic and static arrays. – GManNickG Jul 11 '09 at 17:46
  • What I meant with const-correctness was marking the function as const, and using constant accessors into the array. And it does know if the function has other side-effects, it can look at it. If it only returns one member variable, and the function is const, then the only thing left to do is determine if the code in the loop will modify the array in any way, which it can also do. – GManNickG Jul 11 '09 at 17:48
  • If reverse iterators weren't so fiddly there'd be less call for backward loops, that's for sure. I do agree entirely that reversing the order of a loop, just to avoid storing the return value from some function that gives you the bounds, isn't a great refactor. – Steve Jessop Jul 11 '09 at 17:50
  • "it does know if the function has other side-effects, it can look at it". It can only look at it if it can inline it. For most compilers, that means it has to be in the same compilation unit: optimisation at link time is rare. – Steve Jessop Jul 11 '09 at 17:51
  • True, I'm used to being in the MSVC++ mindset. Though I've always wondered why compilers don't optimize aggressively at link time. It seems like a good opportunity. – GManNickG Jul 11 '09 at 17:53
  • Probably because they've thrown away all the parse tree and data-flow information, and don't want to take time to recreate it and re-run all their optimisation heuristics. I guess maybe an option to optimise at static link time *instead* of compile time might be good all round. But I think most linkers are dumber than they look. – Steve Jessop Jul 11 '09 at 17:59
  • When I *have* to count towards zero with `size_t` then I use `for(size_t i = num_employees_; i--; )`. It's a lot cleaner... – Yakov Galka Aug 25 '12 at 21:27
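
To make the countdown idiom from the last comment concrete, here is a minimal sketch. The function name reverse_indices is made up for illustration; it only records the order in which the indices are visited. The test runs before the decrement, so the loop never depends on the counter going below zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: visits indices n-1, n-2, ..., 0 with an unsigned
// counter and returns them in visiting order.
// `i-- > 0` compares the old value of i with 0 and then decrements it,
// so the body sees n-1 down to 0 and the loop stops once i was 0.
std::vector<std::size_t> reverse_indices(std::size_t n) {
    std::vector<std::size_t> visited;
    for (std::size_t i = n; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```

After the loop ends, i has wrapped around to the largest size_t value, which is well-defined for unsigned types; that is fine here because i is scoped to the loop.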

9 Answers

15

size_t may have different size to int.

For things like the number of employees, this difference is usually inconsequential; how often does one have more than 2^32 employees? However, if you need a field to represent a file size, you will want to use size_t instead of int, if your filesystem supports 64-bit files.

Do realise that object sizes (as obtained by sizeof) are of type size_t, not int or unsigned int; also, correspondingly, there is a ptrdiff_t for the difference between two pointers (e.g., &a[5] - &a[0] == ptrdiff_t(5)).
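
A small sketch of the two types mentioned above, checked at compile time. The function name distance_between_elements is just an illustrative choice; the array mirrors the &a[5] - &a[0] example.

```cpp
#include <cstddef>
#include <type_traits>

// The result of sizeof has type std::size_t.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof yields std::size_t");

// Hypothetical helper: subtracting two pointers into the same array
// yields a std::ptrdiff_t, which is a signed type.
std::ptrdiff_t distance_between_elements() {
    int a[10] = {};
    static_assert(std::is_same<decltype(&a[5] - &a[0]), std::ptrdiff_t>::value,
                  "pointer difference yields std::ptrdiff_t");
    return &a[5] - &a[0];
}
```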

Mankarse
C. K. Young
  • 16
    You should use off_t for filesizes, not size_t. On a 32-bit system, size_t will be 32 bit, while off_t will be 64 bit (assuming large file support). – Søren Løvborg Jun 15 '10 at 18:59
  • @Søren: +1 Very good point. I'll try to think of another example where a 64-bit `size_t` comes in, then update my post. Any suggestions you have are welcome. :-) – C. K. Young Jun 16 '10 at 03:04
7

Using size_t helps with portability in many situations. size_t isn't always "unsigned int", but it is always large enough to represent the size of the largest possible object on the given platform. For instance, some platforms have a 16-bit int but use 32-bit pointers. In that case, if you use unsigned int for the size of something, you restrict it to 65,535 bytes (or other elements), even though the platform can handle something much larger.

In your example I would probably use a typedef for a 32-bit or 64-bit unsigned integer rather than using int or unsigned int or size_t.

Gerald
7

In your case don't use any of them. Either use a container and iterators or create a new data type (e.g. employee database) which offers iterator/range access.

As for unsigned, Bjarne Stroustrup wrote in TCPL:

The unsigned integer types are ideal for uses that treat storage as a bit array. Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.
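
To illustrate the last sentence of the quote, here is a minimal sketch. The Company class and the helper employees_after_setting are made up for this example; the point is only that an unsigned parameter does not reject a negative int, it silently converts it.

```cpp
// Hypothetical class: the setter takes unsigned int in the hope of
// ruling out negative head counts.
class Company {
public:
    void set_num_employees(unsigned int n) { num_employees_ = n; }
    unsigned int num_employees() const { return num_employees_; }

private:
    unsigned int num_employees_ = 0;
};

// Hypothetical helper: passes a (possibly negative) int to the setter.
// The int is implicitly converted to unsigned int at the call, so -1
// arrives as the largest unsigned int value instead of being rejected.
unsigned int employees_after_setting(int requested) {
    Company company;
    company.set_num_employees(requested);
    return company.num_employees();
}
```

With typical default warning settings this compiles silently; some compilers offer an opt-in warning for such conversions (for example -Wsign-conversion in GCC and Clang).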

rpg
3

size_t is specifically intended for specifying the memory size (in bytes) of a value. It is the type of sizeof expressions.

You should only use size_t for this purpose; for other things, use int or define your own type.

See also Wikipedia.

Søren Løvborg
2

You can always use things like

employeeList.size();

or

EmployeeList::size_type i = 0;

or

EmployeeNumber number = employee.getNumber();

I mean: do encapsulate ints and other types like this, unless they are only used in some internal calculation or algorithm.
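
A rough sketch of what that encapsulation could look like. Employee, EmployeeList and count_named are hypothetical names, and backing the list with std::vector is an assumption; callers only ever see EmployeeList::size_type, so the underlying integer type can change without touching them.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type for the example.
struct Employee {
    std::string name;
};

// Hypothetical wrapper that exposes its own size_type instead of
// committing callers to int, unsigned int or size_t.
class EmployeeList {
public:
    using container_type = std::vector<Employee>;
    using size_type = container_type::size_type;

    void add(Employee e) { employees_.push_back(std::move(e)); }
    size_type size() const { return employees_.size(); }
    const Employee& operator[](size_type i) const { return employees_[i]; }

private:
    container_type employees_;
};

// Counts the employees with the given name, using only the list's own types.
EmployeeList::size_type count_named(const EmployeeList& list,
                                    const std::string& name) {
    EmployeeList::size_type matches = 0;
    for (EmployeeList::size_type i = 0; i < list.size(); ++i) {
        if (list[i].name == name) {
            ++matches;
        }
    }
    return matches;
}
```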

Mykola Golubyev
0

I would use unsigned int or int in this case. size_t is, it seems, primarily for representing the size of data structures. Conceptually, while the number of employees may be in some cases the size of a data structure, it is not necessarily; it might just be a count. So I'd use int, but I probably wouldn't push the point too hard.

Michael Ekstrand
0

I'm not a professional, but I only use size_t for memory sizes to increase the readability of the code. In every other case I use unsigned int (or anything else).

gufftan
0

I'd say it can't be bad C++ style to use size_t not only for the sizes of memory regions but also for indices (such as the index variable in a for loop), because the C++ reference contains declarations like:

T& operator[](size_t i) const;

(This example comes from the standard unique_ptr class template; it is the array specialization, unique_ptr<T[]>, that provides operator[].)

So, the standard operator is intended to be used to index elements of arrays, and the type of its argument is exactly size_t.
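
For what it's worth, a short sketch of indexing through that operator with a size_t loop variable. The function sum_first and the values it stores are made up for illustration.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper: fills an array owned by unique_ptr<int[]> with
// 1, 2, ..., n and returns the sum. The index variable is std::size_t,
// matching the parameter type of unique_ptr<T[]>::operator[].
int sum_first(std::size_t n) {
    std::unique_ptr<int[]> values(new int[n]);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i) + 1;
    }
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```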

imz -- Ivan Zakharyaschev
-1

You don't want to use plain int in this case because number of employees is never negative and you want your compiler to help you enforce this rule, right?

Piotr Dobrogost
  • 5
    Well, I'd be happy if I got a compiler or runtime warning for this: num_employees_ = 0; --num_employees_; But it doesn't happen. – Frank Jul 11 '09 at 16:24
  • @dehmann But you can write functions/methods taking unsigned int as a parameter and when users pass plain int they will be warned. – Piotr Dobrogost Jul 11 '09 at 16:32
  • 2
    @Piotr: no they won't. "void foo(unsigned int a) {} int main() { int a = -1; foo(a); }" doesn't warn on any compiler I know. Passing a negative literal does. – Steve Jessop Jul 11 '09 at 16:41
  • Check this thread: http://stackoverflow.com/questions/765709/why-compiler-is-not-giving-error-when-signed-value-is-assigned-to-unsigned-intege/765722#765722 – GManNickG Jul 11 '09 at 17:27
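
The first comment above can be reproduced with a few lines. This is only a sketch; the function name decrement_from_zero is invented here. Decrementing an unsigned zero is well-defined in C++ and wraps around, so neither the compiler nor the runtime has any error to report.

```cpp
#include <cstddef>

// Hypothetical helper mirroring "num_employees_ = 0; --num_employees_;".
// Unsigned arithmetic is modular, so the decrement wraps around to the
// largest value std::size_t can hold rather than producing -1.
std::size_t decrement_from_zero() {
    std::size_t num_employees = 0;
    --num_employees;
    return num_employees;
}
```

If a negative count must be caught, that check has to be written explicitly (for example, testing for zero before decrementing); the unsigned type alone does not enforce it.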