6

I wonder why the size of int depends on which OS one is using, in C & C++. It's OK if the size of a pointer varies, but why the size of an integer? On a 16-bit OS sizeof(int) = 2 bytes, on a 32-bit OS sizeof(int) = 4 bytes. Why so?

Thanks.

Pranit P Kothari

4 Answers

7

Why so?

Historical reasons.

Before the advent of the ANSI C standard and size_t in 1989, int was the type used to index into arrays. malloc took an int as its argument, strlen returned one. Thus int had to be large enough to index any array, but small enough to not cause too much overhead. For file offsets, typically a larger type such as long was typedef'd to off_t.

On the PDP-11, where C was first implemented in the early 1970s, int was as large as a processor register: 16 bits. On larger machines such as the VAX, it was widened to 32 bits to allow for larger arrays.

This convention has been largely abandoned; the C and C++ standards now use size_t for indices and lengths of arrays (POSIX adds the signed ssize_t). On 64-bit platforms, int is often still 32 bits wide while size_t is 64 bits. (Many older APIs, e.g. CBLAS, still use int for indices, though.)
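A quick way to see this split is to print the sizes directly. A minimal sketch; the numbers in the comments assume a typical 64-bit Linux/macOS build and are not guaranteed by the standard:

    #include <cstddef>  // std::size_t
    #include <cstdio>

    int main() {
        // Implementation-defined: commonly prints 4 and 8 on a 64-bit
        // platform, and 4 and 4 on a 32-bit one.
        std::printf("sizeof(int)    = %zu\n", sizeof(int));
        std::printf("sizeof(size_t) = %zu\n", sizeof(std::size_t));
    }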

Fred Foo
2

A byte is the smallest unit of memory which your target system can handle and uniquely address. As such, the size of a byte is platform- and compiler-dependent, but in most settings it is 8 bits.

So, assuming a byte is 8 bits, this would mean that 64 bits equals 8 bytes, 32 bits equals 4 bytes and 16 bits equals 2 bytes.

Roughly speaking, an "X-bit system" is a system where basic values (registers, ints, etc.) are X bits wide by default. Thus, the number of bits your system natively uses directly affects how many bytes are needed to hold those values.
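To check the actual byte width on a given platform, the standard exposes it as CHAR_BIT. A minimal sketch:

    #include <climits>  // CHAR_BIT: number of bits in a byte
    #include <cstdio>

    int main() {
        // CHAR_BIT is 8 on virtually all current hardware, but the
        // standard only guarantees a minimum of 8.
        std::printf("bits per byte: %d\n", CHAR_BIT);
        std::printf("bits per int : %zu\n", sizeof(int) * CHAR_BIT);
    }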

Agentlien
  • Size of byte is compiler-dependent? – Pranit P Kothari Jan 10 '13 at 11:22
  • @PranitPKothari It depends on your target platform, but is guaranteed to be "at least 8 bits". This is because some special hardware has properties which make other byte sizes more suitable. For example, if your hardware's addressing has a 16-bit resolution, you need a byte to be at least 16 bits. See my edit regarding what a byte really is. – Agentlien Jan 10 '13 at 11:24
  • It doesn't have to be at least 8 bits – a3f Jan 10 '13 at 11:26
  • @a3f In standard C and C++, yes it does. – nos Jan 10 '13 at 11:28
  • According to the C++ standard, section 1.7.1: *"A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined."* Given that it has to be large enough to contain any 8-bit character, it does have to be at least 8 bits. – Agentlien Jan 10 '13 at 11:28
2

According to the C++ standard

1.7.1 states:

The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set ...

then 3.9.1.1 states:

Objects declared as characters (char) shall be large enough to store any member of the implementation’s basic character set.

So we can infer that a char is exactly one byte. Most importantly, 3.9.1.2 also says:

There are five signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs.

So in other words, the size of int is (a) guaranteed to be at least as large as a char and (b) the natural size suggested by the OS/hardware it's running on, which these days most commonly means 32 bits; even on 64-bit platforms, int usually stays at 32 bits under the common LP64 and LLP64 data models.
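Those guarantees can be written down as compile-time checks. A minimal sketch (C++11 or later for static_assert); it encodes only what the quoted passages promise, so it should compile on any conforming implementation:

    #include <climits>

    // Only what the standard guarantees, on every conforming implementation.
    static_assert(sizeof(char) == 1, "sizeof measures in units of char");
    static_assert(CHAR_BIT >= 8, "a byte holds at least 8 bits");
    static_assert(sizeof(short) <= sizeof(int), "3.9.1.2 ordering");
    static_assert(sizeof(int) <= sizeof(long), "3.9.1.2 ordering");
    static_assert(sizeof(long) <= sizeof(long long), "3.9.1.2 ordering");

    // An exact-width check like the one below is a property of a single
    // platform, NOT of the language, and may fail elsewhere:
    // static_assert(sizeof(int) == 4, "not guaranteed by the standard");

    int main() {}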

Component 10
  • Or for [some even older systems](http://stackoverflow.com/questions/6971886/exotic-architectures-the-standard-committee-cares-about/6972551#6972551) it can even be 36-bits. – Bo Persson Jan 10 '13 at 12:15
  • @Bo hmmm systems on which multics run.. for instance :-) – Déjà vu Jan 10 '13 at 13:33
1

Your question can be reformulated as: why do the sizes of the basic data types depend on the CPU? This is justified simply by the fact that C is a language designed for low-level programming. You can see in Data Types in the Kernel how data types are defined for different CPU architectures in Linux. The sizes are tied to the word length of the processor.
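When code must not depend on the processor's word length (as in drivers, file formats, or network protocols), the usual remedy is the fixed-width aliases from <cstdint>. A minimal sketch, using the standard aliases rather than the kernel's own typedefs:

    #include <cstdint>
    #include <cstdio>

    int main() {
        // These have the same width on every platform that provides them,
        // regardless of the CPU's native word length.
        std::int16_t a = 0;  // exactly 16 bits
        std::int32_t b = 0;  // exactly 32 bits
        std::int64_t c = 0;  // exactly 64 bits
        std::printf("%zu %zu %zu\n", sizeof(a), sizeof(b), sizeof(c));
    }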

Mihai8