Integer promotion works by promoting every type of lower rank than int to either int or unsigned int. But why is this so?

It makes sense to distinguish between "upgrading" and "downgrading" a type. When you convert a short to a char you may lose data.

However, when going up in rank (bool -> char -> short -> int -> long -> long long) there is no chance of losing data. If I have a char, it doesn't matter whether I convert it to a short or an int; I still won't lose any data.
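A quick way to observe the rule being asked about is the following minimal C++ sketch. The helper name `promoted_t` is hypothetical, and the asserted types assume a mainstream platform where int is wider than short. Unary `+` applies the integer promotions to its operand, so `decltype` can report the promoted type:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: unary plus performs the integer promotions on its
// operand, so decltype(+x) names the promoted type.
template <typename T>
using promoted_t = decltype(+std::declval<T>());

// Everything ranked below int ends up as int (assuming int can represent
// all values of the source type, as on typical platforms).
static_assert(std::is_same<promoted_t<bool>, int>::value, "bool -> int");
static_assert(std::is_same<promoted_t<char>, int>::value, "char -> int");
static_assert(std::is_same<promoted_t<short>, int>::value, "short -> int");
static_assert(std::is_same<promoted_t<unsigned short>, int>::value,
              "unsigned short -> int");

// Types ranked at or above int are left alone by integer promotion.
static_assert(std::is_same<promoted_t<long>, long>::value, "long unchanged");
static_assert(std::is_same<promoted_t<long long>, long long>::value,
              "long long unchanged");
```

All checks happen at compile time. Note that char never stops at short on the way up: the promoted type is int directly.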

My question is: why does integer promotion only go from a lower-ranked type to int? Why was the decision made to have it like this? Why not promote to the next higher-ranked type, for example (and then go on from there, trying to promote again)?

It seems to me that the implicit conversion semantics are a bit arbitrary when you try to describe them: "Most integer types can be promoted, meaning converted with no possibility of data loss, but promotion only works towards int, not towards any other higher-ranked type. If you convert to anything other than int, it is called a conversion."
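In C++ the promotion/conversion distinction is more than naming, because overload resolution ranks a promotion above a conversion. The sketch below illustrates this; the function names `target` and `other` are hypothetical and exist only for the demonstration:

```cpp
// Hypothetical overload set: one candidate is reachable by promotion,
// the other only by conversion.
constexpr int target(int)  { return 1; }
constexpr int target(long) { return 2; }

// short -> int is an integral promotion; short -> long is an integral
// conversion. Promotion ranks higher, so target(int) is selected.
static_assert(target(short{0}) == 1, "short argument selects target(int)");

constexpr int other(short) { return 1; }
constexpr int other(int)   { return 2; }

// char -> int is a promotion, char -> short is only a conversion, so the
// "nearer" type short loses to int.
static_assert(other('a') == 2, "char argument selects other(int)");
```

If the candidates were `long` and `long long` instead, a `short` argument would need a conversion for both, and the call would be ambiguous.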

Would it not be simpler to use the actual ranks of the integer types to attempt a series of "promotions"? Or just to call any conversion towards a higher-ranked integer type a "promotion"?

P.S. This is an educational question, not one about a specific programming issue; I am asking out of curiosity.

Rares Dima
  • 3
    I would say: mostly historical reasons, to match the register size for speed. – Jarod42 May 24 '21 at 16:08
  • 5
    Pretty sure this is behavior inherited from C. Might want to include that tag in this case. – NathanOliver May 24 '21 at 16:13
  • Related: why does the language make numeric literals `int` if they fit, and not the smallest standard integral type that could hold the literal value. – Ben Voigt May 24 '21 at 16:21
  • 7
    `int` is intended to represent the "natural" operation size of the underlying processor architecture. Operations on `long` and `long long` might require multiple machine instructions even for primitive operations (`|&^+-`). On the other hand, operating on smaller types (`short`, `char`) doesn't save any instructions on sane architectures (and may even be more complex, for example byte memory operations on early ALPHA processors). – EOF May 24 '21 at 16:40
  • @EOF If `int` is the "natural" operation size then it makes sense that operations on `char` for example would not save instructions per se, but would the same instruction not take less time if the operands were smaller types since there are fewer bits? (`ADD`-ing `char`s instead of `int`s for example). – Rares Dima May 24 '21 at 16:46
  • @RaresDima: The fastest possible time for any operation depends on data dependencies. For example, bitwise-AND has no data dependencies at all, so the theoretical time for a 1-bit AND and a 1-million-bit AND is the same. Whereas with addition, there are carries to consider (and carry-lookahead logic to reduce the impact of dependencies). The actual time depends on the clock speed of the processor, which is limited by considerations such as communicating operands from the register file to the ALU, communicating and storing results back to the register file, and communication with the rest of the system. – Ben Voigt May 24 '21 at 16:50
  • @BenVoigt makes sense! But *theoretical* speed is not the same as actual speed. I am not a specialist in CPU architecture, but from what I remember from my computer architecture class I doubt the runtime of even an 8-bit AND vs a 1024-bit AND would be the same. And this will probably keep being the case at least for another decade or two. Or am I missing something? – Rares Dima May 24 '21 at 17:29
  • 1
    All changes from one type to another are “conversions.” We use other words to talk about certain instances of conversions. The “integer promotions” are those conversions that occur due to the rank rules you refer to. There are also “casts” that are conversions requested explicitly in the source code. (Properly, a “cast” might be specifically the operator in source code that requests a conversion, but we use “cast” to refer to the conversion too.) And implied conversions happen in certain circumstances such as assigning to a different type, although I am not aware of another name for them. – Eric Postpischil May 24 '21 at 18:01
  • @EricPostpischil true, all changes are "conversions", but for the purpose of this question it is clear that we will use the terms "promotion" and "conversion" to refer to the designated kinds of type changes. It still does not explain the reasoning for the details of promotions. – Rares Dima May 24 '21 at 18:13
  • 2
    Actually, if you have a processor with AVX512, a 512-bit `AND` *is* as fast as an 8-bit `AND`. – EOF May 24 '21 at 18:18
  • @EOF makes sense, but bitwise operations are still only a minority of the operations usually done with ints. – Rares Dima May 24 '21 at 18:56
  • @RaresDima: That's where the actual considerations come in (some of which I listed in my earlier comment). If 32-bit addition is as fast, even with data dependencies, as moving data around the register file, then it's one clock cycle, and 8-bit addition won't be any faster, since it also will be one clock cycle. – Ben Voigt May 24 '21 at 19:59
  • IOW you have some operation in the pipeline that can't be broken into multiple steps, and this operation determines the maximum clock rate. Making operations "faster" than one clock cycle doesn't do any good. – Ben Voigt May 24 '21 at 20:00
  • I'm going to assume that it has to do with an int being either 2 or 4 bytes depending on the architecture, thus being different from all the other types listed. – Jesse Taube May 26 '21 at 17:31
  • Maybe it is because of backward compatibility, as many embedded systems still have 16-bit processors. An integer-type datum could be fetched in about a single memory cycle. – rsonx May 27 '21 at 10:39
  • https://stackoverflow.com/questions/46073295/implicit-type-promotion-rules a C related answer – Alessandro Teruzzi Jun 01 '21 at 07:19
  • The promotion requirements are not hard, in the sense that they have to be performed; rather the results have to be as if they were performed. A simple fragment `char a,b,c; .... c = a+b/2;` with clang -Os generates byte operations. On some machines, like a pdp11, some arithmetic instructions were only available on word sized items; and generally this is the least surprising outcome. You can simply write a program to compare the results of `(a<<b)>>c`, `(char)(a<<b)>>c` and `(char)((a<<b)>>c)`. Try 96, 4, 2 as values, and check your compiler options. – mevets Jun 01 '21 at 16:41
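The comparison suggested in the last comment can be sketched as follows, assuming the three expressions are `(a<<b)>>c`, `(char)(a<<b)>>c` and `(char)((a<<b)>>c)`. The function names are hypothetical, and the sketch assumes an 8-bit char on a mainstream compiler where out-of-range casts to char wrap around:

```cpp
// Hypothetical names for the three expressions; operands are plain char.
// In every case a is promoted to int before the left shift.
constexpr int shift_in_int(char a, char b, char c) {
    return (a << b) >> c;            // whole computation stays in int
}
constexpr int cast_then_shift(char a, char b, char c) {
    return (char)(a << b) >> c;      // truncate to char, then shift right
}
constexpr int shift_then_cast(char a, char b, char c) {
    return (char)((a << b) >> c);    // truncate only the final result
}

// 96 << 4 is 1536 as an int; 1536 >> 2 is 384.
static_assert(shift_in_int(96, 4, 2) == 384, "int arithmetic keeps the high bits");

// The low 8 bits of 1536 (0x600) are zero, so the truncated value is 0.
static_assert(cast_then_shift(96, 4, 2) == 0, "truncating early loses the value");

// 384 is 0x180; its low byte 0x80 reads as -128 if char is signed and
// 128 if char is unsigned, so both are accepted here.
static_assert(shift_then_cast(96, 4, 2) == -128 || shift_then_cast(96, 4, 2) == 128,
              "result depends on the signedness of char");
```

The three results differ, which is why a compiler may use byte instructions only when the final result matches what int arithmetic would have produced.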

1 Answer

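In the C standard, Section 6.3.1.8 describes the "usual arithmetic conversions", which, for integer operands, start by applying the integer promotions of Section 6.3.1.1 (section numbers as of C99; the link is to the C11 draft):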

Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result.
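As an illustration of that "common real type", here is a minimal C++ sketch (C++ inherits these rules from C). The helper name `sum_t` is hypothetical, and the first assertion assumes a typical platform where int can represent every char and short value:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: the type of a + b for operands of types A and B.
template <typename A, typename B>
using sum_t = decltype(std::declval<A>() + std::declval<B>());

// Both operands rank below int: each is promoted to int first, so the
// common type is int rather than short.
static_assert(std::is_same<sum_t<char, short>, int>::value, "char + short -> int");

// Same signedness, different rank: the higher-ranked type wins.
static_assert(std::is_same<sum_t<int, long>, long>::value, "int + long -> long");
static_assert(std::is_same<sum_t<long long, short>, long long>::value,
              "long long + short -> long long");

// Same rank, different signedness: the unsigned type wins.
static_assert(std::is_same<sum_t<int, unsigned int>, unsigned int>::value,
              "int + unsigned -> unsigned");

// A floating operand pulls the integer operand over to its type.
static_assert(std::is_same<sum_t<int, double>, double>::value, "int + double -> double");
```

Above int, the rules do walk up the ranks; int only acts as a floor for the smaller types.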

The C99 Rationale V5.10 describes the reason for this as:

Explicit license was added to perform calculations in a “wider” type than absolutely necessary, since this can sometimes produce smaller and faster code, not to mention the correct answer more often. Calculations can also be performed in a “narrower” type by the as if rule so long as the same end result is obtained. Explicit casting can always be used to obtain a value in a desired type.
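The "correct answer more often" point can be made concrete with a small sketch. The function names are hypothetical, and the narrow variant only imitates 8-bit arithmetic with an explicit cast; it is not how any particular compiler evaluates the expression:

```cpp
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Operands are promoted to int, so the intermediate sum 300 is representable.
constexpr int midpoint_promoted(unsigned char a, unsigned char b) {
    return (a + b) / 2;
}

// Imitation of arithmetic carried out in the operand type: the sum is
// truncated to 8 bits before the division.
constexpr int midpoint_8bit(unsigned char a, unsigned char b) {
    return static_cast<unsigned char>(a + b) / 2;
}

// 200 + 100 = 300, and 300 / 2 = 150.
static_assert(midpoint_promoted(200, 100) == 150, "wide intermediate is correct");

// 300 wraps to 44 in 8 bits, and 44 / 2 = 22.
static_assert(midpoint_8bit(200, 100) == 22, "narrow intermediate wraps around");
```

When the sum fits in 8 bits, the two functions agree, which is the case where the as-if rule lets a compiler use the narrower operation.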

From the rationale, it is reasonable to infer that the committee sees this as the simplest solution that captures the greatest number of possible uses.

From a simplicity standpoint, having a rank-by-rank promotion system would greatly increase the detail required to implement the standard. It would also create a wide variation of performance issues between platforms of different bit sizes. Programmers seeking to achieve specific objectives with data types are still given that flexibility through explicit casting of types.

thelizardking34
  • Bit size differences are present in all systems. Why would it only be an issue for rank-by-rank promotion? Also this still does not motivate why `int` was chosen of all types. Why not `long long` for instance since it is the widest type? – Rares Dima Jun 03 '21 at 05:37
  • 1
    The C language was designed to map efficiently to typical machine instructions. Historically, 'int' was intended to provide portability between systems, so code could be reused by multiple systems of different sizes. At the time of C99 (when 'long long' was added) 64-bit systems were just entering the market. It's also important to note that 'long long' isn't necessarily 'wider' than other integer types. It is entirely possible that an int and a long long are both 64 bits wide, depending on the implementation. The C standard only requires a minimum size for these types, and specifies their relationship. – thelizardking34 Jun 03 '21 at 14:56
  • thank you! Please add this to your answer and I'll accept it! – Rares Dima Jun 04 '21 at 10:14
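The point about minimum sizes and relationships can be checked on any given implementation with a sketch like the following. The helper name `range_fits` is hypothetical; the numeric bounds are the minimum magnitudes the C standard requires for `INT_MAX`, `LONG_MAX` and `LLONG_MAX`:

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: true if every value of the signed type Narrow is
// representable in the signed type Wide.
template <typename Narrow, typename Wide>
constexpr bool range_fits() {
    return std::numeric_limits<Narrow>::min() >= std::numeric_limits<Wide>::min()
        && std::numeric_limits<Narrow>::max() <= std::numeric_limits<Wide>::max();
}

// The relationship between ranks: each range contains the previous one.
// Nothing here requires the ranges to be strictly larger.
static_assert(range_fits<signed char, short>(), "signed char fits in short");
static_assert(range_fits<short, int>(), "short fits in int");
static_assert(range_fits<int, long>(), "int fits in long");
static_assert(range_fits<long, long long>(), "long fits in long long");

// The guaranteed minimum ranges.
static_assert(INT_MAX >= 32767, "int has at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long has at least 32 bits");
static_assert(LLONG_MAX >= 9223372036854775807LL, "long long has at least 64 bits");
```

An implementation where int, long and long long all have the same width satisfies every one of these checks.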