We all know that 5 in base 10 is represented as 101 in base 2. We also know that a computer stores this number in binary. However, one thing that I couldn't find online keeps bugging me: down to the assembly code level, we still represent numbers in decimal. How does a computer map this decimal number to binary? I guess I am really asking how the assembler translates decimal to binary. Thanks for helping!

  • An assembly program is just a text representation, and like any other program code it needs to be turned into binary form for your OS/machine to execute. That’s what the “assembler” (e.g. [GNU `as`](https://www.gnu.org/software/binutils/)) does. If you’re curious about the details, study a CPU architecture (the opcodes of instructions) of your choice and then you’ll see where the bits go… – Jens Mar 27 '20 at 08:30
  • it doesn't. in your program, as mentioned, 5 is probably the value 0x35, which was captured through layers from the keyboard that sent 0x35. The text editor simply recorded that value. Then later the assembler reads the source code, sees those bit patterns, understands them as characters and values, and then assembles them into other bits; those bits are stored in a file. Later those bits are put in memory or flash and possibly executed; during decoding and/or execution some of those bits may have some meaning, like the value 0b101 = 5 = 0x5, to us humans. – old_timer Mar 27 '20 at 14:48
  • computers are extremely dumb, there is no magic there, very elementary and primitive. – old_timer Mar 27 '20 at 14:48
  • in your case the human hits the 5 key on the keyboard and that maps to a keycode in the keyboard firmware which then may map through some other (human generated) table. The keycode was hand generated by a human so the 5 to keycode was not magic, just some other software. – old_timer Mar 27 '20 at 14:50
  • in general the computer only sees bits, they mean nothing, except for opcodes that indicate what to do with the bits for the few clock cycles of that instruction. – old_timer Mar 27 '20 at 14:57

1 Answer

The ASCII symbol 5 is itself represented by an 8-bit pattern (see [asciitable.com](http://asciitable.com/)). (Or if you want to talk about computing history, ASCII proper is a 7-bit encoding, but these days we store ASCII codes in separate bytes on computers that use 8-bit bytes.) This initially gets into a computer over a network, from disk, or via a keyboard driver that maps scan-code bit-patterns from keyboard input into character values. Everything in a binary computer is done in binary, regardless of what the bit-patterns represent.
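
As a minimal C sketch of that point (my own illustration, not part of the original answer): the character `5` that arrives from a file or the keyboard is just the byte `0x35`, i.e. the bit pattern `0b00110101`.

```c
#include <stdio.h>

int main(void) {
    char c = '5';   /* the ASCII *character* five, not the number 5 */
    printf("'%c' is stored as byte 0x%02X (binary 00110101)\n",
           c, (unsigned)(unsigned char)c);
    return 0;
}
```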

Interestingly, the low 4 bits of the ASCII codes for `'0'`..`'9'` do represent their integer value, so character -> integer is as simple as `c & 0x0f`. (Which on x86, for example, can be implemented as `and al, 0xf`.)
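
A tiny hedged sketch of that mask trick (the function name is mine):

```c
/* Valid only when c is already known to be an ASCII digit '0'..'9':
   '5' is 0x35, and 0x35 & 0x0F == 5. */
unsigned ascii_digit_value(char c) {
    return c & 0x0F;
}
```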

Or you can do it as `c - '0'` using binary subtraction on the character value. That's often more handy because it can be part of range-checking to see if the character even is a digit in the first place: `unsigned digit = c - '0'; if (digit > 9) goto non_digit;`.
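
A sketch of that subtract-and-range-check idiom as a C helper (the name `parse_digit` is my own):

```c
/* Returns 0..9 for a decimal digit character, -1 otherwise.
   The unsigned subtraction makes values below '0' wrap to something
   huge, so a single compare rejects both ends of the range. */
int parse_digit(char c) {
    unsigned digit = (unsigned char)c - '0';
    if (digit > 9)
        return -1;
    return (int)digit;
}
```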

The standard algorithm for multi-digit strings that represent numbers does this in a loop, accumulating `total = total*10 + digit_value`. See *NASM Assembly convert input to integer?* for details with C and x86 asm.
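
A minimal C version of that loop, assuming a non-negative decimal string and skipping overflow checking (the function name is my own):

```c
#include <stdio.h>

/* Accumulate total = total*10 + digit for each ASCII digit in s. */
unsigned long parse_decimal(const char *s) {
    unsigned long total = 0;
    while (*s >= '0' && *s <= '9') {
        total = total * 10 + (unsigned)(*s - '0');
        s++;
    }
    return total;
}

int main(void) {
    printf("%lu\n", parse_decimal("12345"));   /* prints 12345 */
    return 0;
}
```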


Computers don't work with decimal numbers internally; it's always binary bit patterns. (You can write a program that keeps separate decimal digits in separate bytes, but the computer doesn't truly know they represent decimal digits, just that you happen to be multiplying or dividing them by 0b1010 = 10 = 0x0a.)
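
For the reverse direction, here is a hedged sketch (my own, not from the answer) of turning a binary integer back into decimal characters by repeated division by 10: the arithmetic is all binary, and "decimal" only exists in how we interpret the output bytes.

```c
#include <stdio.h>

/* Emit the decimal digits of n as ASCII bytes, lowest digit computed first. */
void print_decimal(unsigned n) {
    char buf[16];
    int i = 0;
    do {
        buf[i++] = '0' + (n % 10);   /* binary remainder, re-offset into ASCII */
        n /= 10;
    } while (n != 0);
    while (i--)
        putchar(buf[i]);             /* digits were produced in reverse order */
    putchar('\n');
}

int main(void) {
    print_decimal(12345u);           /* prints 12345 */
    return 0;
}
```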

Peter Cordes