Why are hexadecimal numbers prefixed with 0x?

Question

Why are hexadecimal numbers prefixed as 0x? I understand the usage of the prefix but I don't understand the significance of why 0x was chosen.

Now I realize that the title and the text ask two entirely different questions. Most replies focus on the question in the title. The answer to the question in the text is simply "it does not mean anything - it is merely a prefix telling the compiler that the integer is written in hexadecimal". — Andreas Rejbrand, Apr 22 '10 at 23:39
To be pedantic, one might also interpret the question in the title in two different ways: 1) "Why are hexadecimal numbers prefixed as 0x, as opposed to any other prefix or indicator?" 2) "Why do we need to use a prefix when entering hexadecimal numbers? Surely the compiler will recognize 58A as a hexadecimal number even without the prefix?" The answer to the second interpretation of the question is trivial. "123" is also a hexadecimal number. — Andreas Rejbrand, Apr 22 '10 at 23:42

score 565 · Accepted Answer · answered Jan 03 '11 at 00:42

565

Short story: The 0 tells the parser it's dealing with a constant (and not an identifier/reserved word). Something is still needed to specify the number base: the x is an arbitrary choice.

Long story: In the 60's, the prevalent programming number systems were decimal and octal — mainframes had 12, 24 or 36 bits per byte, which is nicely divisible by 3 = log2(8).

The BCPL language used the syntax 8 1234 for octal numbers. When Ken Thompson created B from BCPL, he used the 0 prefix instead. This is great because

an integer constant now always consists of a single token,
the parser can still tell right away it's got a constant,
the parser can immediately tell the base (0 is the same in both bases),
it's mathematically sane (00005 == 05), and
no precious special characters are needed (as in #123).

When C was created from B, the need for hexadecimal numbers arose (the PDP-11 had 16-bit words) and all of the points above were still valid. Since octals were still needed for other machines, 0x was arbitrarily chosen (00 was probably ruled out as awkward).

C# is a descendant of C, so it inherits the syntax.

answered Jan 03 '11 at 00:42

Řrřola

6,007
1
17
10

141

I don't think `0x` over `00` was preference/awkwardness. `00` would break existing code. `0010` as octal is `8`, while `0010` as hexidecimal would be `16`. They couldn't use any number as a second digit indicator (except `8` or `9`, and neither holds any significance related to hexidecimal) so a letter is a must. And that leaves either `0h` or `0x` (**H** e **X** idecimal). From this point it seems it's truly back to preference. – GManNickG Jan 12 '13 at 02:30
2

Related: http://stackoverflow.com/questions/18987911/bcpl-octal-numerical-constants and http://stackoverflow.com/questions/11483216/why-are-leading-zeroes-used-to-represent-octal-numbers – Řrřola Jun 10 '14 at 13:13
32

Using a `0` prefix for octal has caused so very many problems over the years. Notably in countries like the UK where telephone numbers start with a `0`. Javascript and many other languages would parse these as octal, mangling the number before storing. To add to the fun, one popular database product would _silently_ switch back to decimal parsing if the number contained an `8` or `9`. – Basic Nov 24 '15 at 19:45
1

12, 24 and 36 are also divisible by 4 so why didn't they think of hexadecimal for that? – phuclv Jan 26 '17 at 06:33
5

@LưuVĩnhPhúc Probably because hexadecimal wasn't very relevant. Most of the hardware, software, and documentation of the time fit octal much better. BCPL was first implemented on a [36 bit IBM 7094](https://en.wikipedia.org/wiki/IBM_7090#IBM_7094), with an instruction format split into two 3 bit parts and 2 15 bit parts; 6 bit characters; and documentation in octal. B's early implementations were on a PDP-7 (18 bit) and a Honeywell GE-945 (36 bit, but with 18 bit addressing, and support for 6 and 9 bit bytes). The 16 bit PDP-11 came out after B, so would not have influenced B's design much. – 8bittree Feb 17 '17 at 19:09
1

What do you mean "the parser can immediately tell the base (0 is the same in both bases).". If it's teh same in octal and decimal... how can you tell the base? – Jwan622 Sep 15 '19 at 18:20
1

@Jwan622 It can tell because in decimal integer constants the first digit _must not_ be `0`, whereas octal constant integers _must_ start with a `0`. – Emax Oct 02 '19 at 16:05
Why the prefix 0x denotes hex number, instead of 0h? – DawnSong Sep 06 '21 at 14:35
1

@DawnSong, because it's hexadecimal (base [6+10]) rather than heximal (base 6) :-) – paxdiablo Jun 21 '22 at 10:44
1

It's a little-known fact that, every time you put a literal `0` in your C source, it's an *octal* zero. It just happens to be a happy coincidence that this is the same value as a decimal zero :-) – paxdiablo Jun 21 '22 at 10:45
Thompson could have instead selected "123hd" for hexadecimals and "101b" for binary. Integers are still a single token, the parser can still know it's not an identifier because of the leading digit, the parser can just store the number as a string until it reaches the end of the token, it's even more "mathematically sane" to never ever risk combining different bases, no special characters are used, AND you get two essential properties: (1) Leading zeroes in numbers-as-strings can't accidentally change the interpreted base, and (2) no more meaningless `0` character RIGHT next to real zeroes. – iono Jul 27 '22 at 14:43
Anyway folks this is why code should never ever be edited as text and we needed structural editors 20 years ago. – iono Jul 27 '22 at 14:58
I don't see how you can call 'x' an "arbitrary choice". "x" sounds like"hex" when you say it, except without the 'h', which some important languages and dialects don't pronounce in words anyway. – H. H. Aug 19 '22 at 11:40

AshleysBrain · Answer 2 · 2010-04-22T12:59:53.890

117

Note: I don't know the correct answer, but the below is just my personal speculation!

As has been mentioned a 0 before a number means it's octal:

04524 // octal, leading 0

Imagine needing to come up with a system to denote hexadecimal numbers, and note we're working in a C style environment. How about ending with h like assembly? Unfortunately you can't - it would allow you to make tokens which are valid identifiers (eg. you could name a variable the same thing) which would make for some nasty ambiguities.

8000h // hex
FF00h // oops - valid identifier!  Hex or a variable or type named FF00h?

You can't lead with a character for the same reason:

xFF00 // also valid identifier

Using a hash was probably thrown out because it conflicts with the preprocessor:

#define ...
#FF00 // invalid preprocessor token?

In the end, for whatever reason, they decided to put an x after a leading 0 to denote hexadecimal. It is unambiguous since it still starts with a number character so can't be a valid identifier, and is probably based off the octal convention of a leading 0.

0xFF00 // definitely not an identifier!

edited Apr 22 '10 at 12:59

answered Apr 19 '10 at 21:20

AshleysBrain

22,335
15
88
124

6

Interesting. I imagine they could have used a leading 0 AND trailing h to denote hex. The trailing h probably would have been confused with the type specifier suffix, e.g. 0xFF00l vs 0FF00hl – zdan Apr 20 '10 at 00:24
3

This argument implies that the use of a leading zero to denote octal numbers predates the use of the hexadecimal "0x" prefix. Is this true? – Andreas Rejbrand Apr 22 '10 at 13:09
1

Wouldn't they have both been invented at the same time? Why would there ever be one but not the other? – AshleysBrain Apr 22 '10 at 13:28
AshleysBrain see answer by @Řrřola for why there could be octal but not hexadecimal at the same time. – jv42 Jan 27 '12 at 09:11
2

@zdan they have used it long time ago. In x86 Intel assembly a hex literal must always be prefixed by 0 if they begin with a character. For example `0xFFAB1234` must be written as `0FFAB1234h`. I remember it from inline asm in Pascal when I was young http://stackoverflow.com/q/11733731/995714 – phuclv Oct 11 '15 at 03:34
Assembly languages that do support the trailing-h syntax *do* require numeric literals to start with a 0-9 digit, for the same reason you say it would be a problem in C. `DEADBEEFh` is a symbol name, `AH` is an x86 register name! Thus **[assemblers support `0FF00h` and/or `0xFF00`](https://stackoverflow.com/questions/11733731/how-to-represent-hex-value-such-as-ffffffbb-in-x86-inline-assembly-programming/37152498#37152498), but *not* `FF00h`** – Peter Cordes Mar 12 '18 at 09:20
As you said, 0h might be better than 0x. Why is 0x chosen finally? – DawnSong Sep 06 '21 at 14:37

loyola · Answer 3 · 2022-08-05T15:53:13.537

40

It's a prefix to indicate the number is in hexadecimal rather than in some other base. The programming language uses it to tell compiler.

Example:

0x6400 translates to 6*16^3 + 4*16^2 + 0*16^1 +0*16^0 = 25600. When compiler reads 0x6400, It understands the number is hexadecimal with the help of 0x term. Usually we can understand by (6400)₁₆ or (6400)₈ or whatever ..

For binary it would be:

0b00000001

Good day!

edited Aug 05 '22 at 15:53

answered Jun 05 '15 at 05:53

loyola

3,905
2
24
18

3

Binary literals are only supported in C++ since C++14, and not supported in C at all. – Ruslan Apr 25 '17 at 13:11
7

This doesn't explain *why*. Particularly, why couldn't you write the first example as `x6400`? The `x` could still be used to infer hexadecimal. – Aaron Franke Nov 07 '19 at 04:57

score 20 · Answer 4 · answered Apr 27 '17 at 00:53

20

The preceding 0 is used to indicate a number in base 2, 8, or 16.

In my opinion, 0x was chosen to indicate hex because 'x' sounds like hex.

Just my opinion, but I think it makes sense.

Good Day!

answered Apr 27 '17 at 00:53

Johnny Low

237
2
2

3

Thanks for the answer! I understand that this is your first post on StackOverflow. The answer could have been more helpful if opinions are separated from facts. – vivek_ganesan Apr 27 '17 at 04:57
4

This answer does a great job of separating opinions from facts. – anatolyg Nov 05 '20 at 14:42
I appreciate the very plausible idea that "x" rhymes with "hex" which is a common way to refer to a hexadecimal number. This is the only answer that mentions it. – Douglas Held Feb 04 '23 at 20:41

score 13 · Answer 5 · answered Jul 29 '21 at 01:21

I don't know the historical reasons behind 0x as a prefix to denote hexadecimal numbers - as it certainly could have taken many forms. This particular prefix style is from the early days of computer science.

As we are used to decimal numbers there is usually no need to indicate the base/radix. However, for programming purposes we often need to distinguish the bases from binary (base-2), octal (base-8), decimal (base-10) and hexadecimal (base-16) - as the most commonly used number bases.

At this point in time it is a convention used to denote the base of a number. I've written the number 29 in all of the above bases with their prefixes:

0b11101: Binary
0o35: Octal, denoted by an o
0d29: Decimal, this is unusual because we assume numbers without a prefix are decimal
0x1D: Hexadecimal

Basically, an alphabet we most commonly associate with a base (e.g. b for binary) is combined with 0 to easily distinguish a number's base.

This is especially helpful because smaller numbers can confusingly appear the same in all the bases: 0b1, 0o1, 0d1, 0x1.

If you were using a rich text editor though, you could alternatively use subscript to denote bases: 1₂, 1₈, 1₁₀, 1₁₆

@DawnSong It's just a convention that is quite popular and now we have to live with it. Sometimes these things are arbitrary. You could try promoting your way, but it's hard to change the habits of so many people. I find the Octal one to be the most confusing visually. — Advait Junnarkar, Sep 08 '21 at 05:46

Why are hexadecimal numbers prefixed with 0x?

5 Answers5

Linked

Related