220

In several C++ examples I see the type size_t used where I would have used a simple int. What's the difference, and why should size_t be better?

abhi
tunnuz
  • 4
    For an actual example where they aren't interchangeable, see a question I asked previously: http://stackoverflow.com/questions/645168/how-to-write-a-stdbitset-template-that-works-on-32-and-64-bit – Tyler McHenry Aug 05 '09 at 18:31

5 Answers

175

From the friendly Wikipedia:

The stdlib.h and stddef.h header files define a datatype called size_t which is used to represent the size of an object. Library functions that take sizes expect them to be of type size_t, and the sizeof operator evaluates to size_t.

The actual type of size_t is platform-dependent; a common mistake is to assume size_t is the same as unsigned int, which can lead to programming errors, particularly as 64-bit architectures become more prevalent.
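To make the quoted point concrete, here is a minimal sketch of compile-time checks. The helper names `size_t_is_unsigned_int` and `size_t_fits_in_unsigned_int` are made up for illustration; only the standard type traits are real. The first two assertions hold everywhere, while the helpers report whether the "size_t is unsigned int" assumption happens to be true on the platform you compile for.

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>

// sizeof always yields std::size_t, whatever that is on this platform.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof evaluates to size_t");

// size_t is guaranteed to be an unsigned integer type.
static_assert(std::is_unsigned<std::size_t>::value, "size_t is unsigned");

// Hypothetical helper: true only where size_t and unsigned int are the same type.
constexpr bool size_t_is_unsigned_int() {
    return std::is_same<std::size_t, unsigned int>::value;
}

// Hypothetical helper: true if every size_t value also fits in an unsigned int.
constexpr bool size_t_fits_in_unsigned_int() {
    return std::numeric_limits<std::size_t>::max()
           <= std::numeric_limits<unsigned int>::max();
}
```

On a typical 64-bit desktop target both helpers would be expected to return false, which is exactly the case where treating the two types as interchangeable goes wrong.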

Also, check Why size_t matters

Community
Joao da Silva
    And so, what is size_t? – NDEthos Dec 02 '15 at 06:30
  • 13
    @NDEthos It depends! On this here Linux `/usr/include/stdlib.h` gets the definition from `/usr/lib/gcc/x86_64-redhat-linux/5.3.1/include/stddef.h` and therein it defaults to `long unsigned int` unless some other header file says otherwise. – David Tonhofer Jan 04 '16 at 19:38
  • 2
    I confirm [size_t to int truncation is dangerous](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-2315). This might be off‑topic, but how can one person write a patch to fix that kind of mistake when it occurs thousands of times in the Linux kernel? – user2284570 Apr 14 '16 at 20:56
51

size_t 1) is the data type used to represent sizes (as its name implies) and 2) is platform (and even potentially implementation) dependent, so it should be used only for representing sizes.

Because it represents a size, size_t is naturally unsigned (can you have a box that is negative 3 meters wide?). Many standard library functions, including malloc and the various string functions, use size_t, and the sizeof operator yields it.

An int is signed by default, and although its size is also platform-dependent, it is 32 bits on most modern machines. By contrast, size_t is typically 64 bits on 64-bit architectures, while int remains 32 bits on those same architectures.

Summary: Use size_t to represent the size of an object and int (or long) in other cases. Be aware that size_t is unsigned, while both int and long are signed by default (unless declared as unsigned int or unsigned long).
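The signed/unsigned difference is the part that tends to bite in practice. A small sketch, with function names invented for this example: subtracting past zero wraps around, so a countdown loop has to be written so that it never tests `i >= 0`.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Subtracting past zero wraps around instead of going negative.
std::size_t one_below_zero() {
    std::size_t n = 0;
    return n - 1;  // wraps to the largest size_t value
}

// Sums a vector back to front with a size_t index. The tempting form
// `for (std::size_t i = v.size() - 1; i >= 0; --i)` never terminates,
// because an unsigned i is always >= 0.
long sum_reversed(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = v.size(); i-- > 0; ) {
        total += v[i];
    }
    return total;
}
```

The `i-- > 0` form tests before decrementing, so it also behaves correctly for an empty vector.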

Stev
Axelle Ziegler
19

The size_t type is defined as the unsigned integer type of the result of the sizeof operator. In the real world, you will often see int defined as 32 bits (for backward compatibility) but size_t defined as 64 bits (so you can declare arrays and structures more than 4 GiB in size) on 64-bit platforms. If a long int is also 64 bits, this is called the LP64 convention; if long int is 32 bits but long long int and pointers are 64 bits, that’s LLP64. You also might get the reverse, a program that uses 64-bit instructions for speed, but 32-bit pointers to save memory. Also, int is signed and size_t is unsigned.
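If you want to see which convention your own toolchain follows, something like the sketch below would do. `data_model` is a hypothetical helper, and it assumes 8-bit bytes, so the byte counts 4 and 8 stand for 32 and 64 bits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: names the data model from the widths of int, long
// and pointers. Assumes 8-bit bytes.
std::string data_model() {
    const std::size_t i = sizeof(int);
    const std::size_t l = sizeof(long);
    const std::size_t p = sizeof(void*);
    if (i == 4 && l == 8 && p == 8) return "LP64";   // long and pointers are 64-bit
    if (i == 4 && l == 4 && p == 8) return "LLP64";  // only long long and pointers are 64-bit
    if (i == 4 && l == 4 && p == 4) return "ILP32";  // everything is 32-bit
    return "other";
}
```

A 64-bit Linux build would typically report LP64 and a 64-bit Windows build LLP64, but the function only reports what the compiler in front of it actually does.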

There were historically a number of other platforms where addresses were wider or shorter than the native size of int. In fact, in the ’70s and early ’80s, this was more common than not: all the popular 8-bit microcomputers had 8-bit registers and 16-bit addresses, and the transition between 16 and 32 bits also produced many machines that had addresses wider than their registers. I occasionally still see questions here about Borland Turbo C for MS-DOS, whose Huge memory mode had 20-bit addresses stored in 32 bits on a 16-bit CPU (but which could support the 32-bit instruction set of the 80386); the Motorola 68000 had a 16-bit ALU with 32-bit registers and addresses; there were IBM mainframes with 15-bit, 24-bit or 31-bit addresses. You also still see different ALU and address-bus sizes in embedded systems.

Any time int is smaller than size_t, and you try to store the size or offset of a very large file or object in an unsigned int, there is the possibility that it could overflow and cause a bug. With an int, there is also the possibility of getting a negative number. If an int or unsigned int is wider, the program will run correctly but waste memory.
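A rough illustration of that overflow, using fixed-width types as stand-ins so the result does not depend on the platform: `std::uint64_t` plays the role of a 64-bit size_t and `std::uint32_t` the role of a 32-bit unsigned int. The function name is made up.

```cpp
#include <cstdint>

// Stand-in for storing a 64-bit size in a 32-bit unsigned int:
// the conversion silently keeps only the low 32 bits.
std::uint32_t truncate_size(std::uint64_t size) {
    return static_cast<std::uint32_t>(size);
}
```

With this, a 5 GiB size (`5ULL << 30`) comes back as 1 GiB, with no error or warning at run time.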

You should generally use the correct type for the purpose if you want portability. A lot of people will recommend that you use signed math instead of unsigned (to avoid nasty, subtle bugs like 1U < -3). For that purpose, the standard library defines ptrdiff_t in <stddef.h> as the signed type of the result of subtracting a pointer from another.
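Both halves of that paragraph can be sketched in a few lines. The function names are invented for the example. Expect a sign-compare warning on the first function from most compilers, which is rather the point.

```cpp
#include <cstddef>

// Mixed comparison: -3 is converted to unsigned before comparing and becomes
// a very large value, so the "obviously false" 1U < -3 is actually true.
bool mixed_compare() {
    unsigned int u = 1U;
    int s = -3;
    return u < s;
}

// Pointer subtraction yields the signed std::ptrdiff_t, so a negative
// distance between two elements of the same array is representable.
std::ptrdiff_t backward_distance() {
    int a[4] = {0, 1, 2, 3};
    const int* first = a;
    const int* last = a + 3;
    return first - last;  // -3
}
```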

That said, a workaround might be to bounds-check all addresses and offsets against INT_MAX and either 0 or INT_MIN as appropriate, and turn on the compiler warnings about comparing signed and unsigned quantities in case you miss any. You should always, always, always be checking your array accesses for overflow in C anyway.

Davislor
8

It's because size_t can be anything other than an int (maybe a struct). The idea is that it decouples its job from the underlying type.

graham.reeds
  • 9
    I think size_t is actually guaranteed to be an alias for an unsigned integer type, so it can't be a structure. I don't have a reference handy to back this up right now, though. – unwind Feb 02 '09 at 11:57
  • 1
    @danio Why is it so? Can you explain? – Rüppell's Vulture May 13 '13 at 09:16
  • 2
    I wouldn't link to cplusplus if I was you! If you can't quote chapter, verse, paragraph and line then it is all just hearsay! :-) – graham.reeds Aug 07 '13 at 07:53
  • 2
    `size_t` is specified as an **unsigned integer** type. C11 §6.5.3.4 5 "The value of the result of both operators (`sizeof` `_Alignof`) is implementation-defined, and its type (an unsigned integer type) is `size_t`,". – chux - Reinstate Monica Jul 07 '15 at 21:51
-2

The definition of SIZE_T is found at: https://msdn.microsoft.com/en-us/library/cc441980.aspx and https://msdn.microsoft.com/en-us/library/cc230394.aspx

Pasting here the required information:

SIZE_T is a ULONG_PTR representing the maximum number of bytes to which a pointer can point.

This type is declared as follows:

typedef ULONG_PTR SIZE_T;

A ULONG_PTR is an unsigned long type used for pointer precision. It is used when casting a pointer to a long type to perform pointer arithmetic.

This type is declared as follows:

typedef unsigned __int3264 ULONG_PTR;
ikegami
sundar