
If my question sounds nonsensical, pardon me.

But I am quite confused. Say I am defining the constant buffer_size: there is a line in the code that I am studying which says buffer_size equ 16, which in my mind means "make buffer_size 16 big". But in other code samples that I have reviewed, numbers have the character h beside them, which I'm told tells the assembler to treat the number as hexadecimal.

If a number doesn't have the h beside it, does that make it decimal?

Peter Cordes
jeremybcenteno

2 Answers


Yes, MASM (and pretty much all other modern assemblers; see Footnote 1) is like C/C++: numeric literals are decimal by default.

You can use other bases with suffixes. See "How to represent hex value such as FFFFFFBB in x86 inline assembly programming?" for the syntax. Some assemblers, like NASM, allow 0x123 as well as 123h, but MASM only allows suffixes.

10h in MASM is exactly like 0x10 in C, and exactly equivalent to 16.

The machine code assembled doesn't depend on the source representation of the number. (mov eax, 10h is 5 bytes: opcode and then 32-bit little-endian binary number, same as mov eax, 16.)
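As a minimal sketch of that point, the NASM-syntax fragment below spells the same constant three ways. It assumes NASM assembling for 32-bit mode (the bits 32 line is an assumption added here, and MASM would reject the 0x10 form). The bytes in the comments are the standard encoding of mov eax, imm32.

```nasm
bits 32             ; assumption: 32-bit mode, so no operand-size prefix is emitted

mov eax, 10h        ; B8 10 00 00 00   hex, h suffix
mov eax, 0x10       ; B8 10 00 00 00   hex, C-style prefix (accepted by NASM, not by MASM)
mov eax, 16         ; B8 10 00 00 00   decimal, the default
```

Assembling this as a flat binary and dumping the bytes should show the same five-byte sequence three times.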

The same goes for foo: db 0FFh: code that adds something to it isn't "adding hex numbers"; it's just adding to a normal binary number. A common beginner mistake (in terminology or understanding, it's usually not clear which) is to confuse the source-code representation with what the machine is doing when it runs the assembler output.
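A small sketch of the same idea for data, in NASM syntax. The label bar and the section layout are illustrative assumptions, not something from the question.

```nasm
section .data
foo:    db 0FFh             ; one byte with all eight bits set
bar:    db 255              ; the same byte, written in decimal (bar is a hypothetical label)

section .text
        add byte [foo], 1   ; plain 8-bit binary add: 0FFh + 1 wraps to 0 and sets CF
        add byte [bar], 1   ; identical result; the CPU never sees "hex" or "decimal"
```

Both bytes are identical in the assembled output, so both add instructions behave the same way at run time.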


Footnote 1: Ancient assemblers might be different. There might be a few assemblers for some non-x86 platforms that also don't default to decimal.

The assembler built into the obsolete DOS DEBUG.EXE treats all numeric literals as hex, so mov ax, 10 there means what mov ax, 8+8 would mean elsewhere. (If it even evaluates constant expressions; if not, you know what I mean.)

DEBUG.EXE doesn't even support labels, so it's basically horrible by modern standards; don't use it. These days there are free open-source assemblers like NASM, and also debuggers, including at least the one built into BOCHS, so there's no need to suffer with old tools.

Anyway, this sidetrack about DEBUG.EXE isn't really relevant to your question about MASM; I only mention it because it is the only assembler I know of that doesn't default to decimal. Such assemblers do exist, but they're not normal these days.

Peter Cordes
  • I would be very careful about generalizations "like almost all other assemblers". You have used the last 50+ years worth of assemblers for the plethora of instruction sets? – old_timer Sep 03 '18 at 03:04
  • @old_timer: I'm talking about modern assemblers that are relevant today. I wasn't trying to imply that DEBUG.EXE was the only one where hex was the default. I'm not aware of any others. This is tagged x86, and definitely all the mainstream x86 assemblers anyone still uses regularly (except DEBUG.EXE) are like this. Apparently some unfortunate beginners still use DEBUG.EXE for their school courses or something. – Peter Cordes Sep 03 '18 at 03:08
  • @PeterCordes In modern times, `Debug` only ships with 32-bit editions of Windows (since it requires virtual 8086 mode) - you'd have to go out of your way to acquire it on a modern system since most people use 64-bit OSes – Govind Parmar Sep 03 '18 at 03:10
  • @PeterCordes you implied all assemblers since the beginning of time. Please re-write if you were meaning to limit to x86 assemblers in the last few years. – old_timer Sep 03 '18 at 03:17
  • @GovindParmar: oh good. I've never used it. I only know about it from SO questions and answers, but by all accounts it's horrible by modern standards. (because it hasn't been updated in 40 years). – Peter Cordes Sep 03 '18 at 03:23
  • @PeterCordes : That would be incorrect since the Windows XP version of debug is slightly different and it wasn't built 40 years ago. It isn't quite the same as the one shipped with DOS. One change was to file redirection. – Michael Petch Sep 03 '18 at 03:24
  • @old_timer: I thought I had sufficient weasel words that I wasn't claiming that every assembler in the world other than DEBUG.EXE used decimal for bare numeric literals. But since that wasn't clear to everyone, I rewrote it. – Peter Cordes Sep 03 '18 at 03:30
  • @GovindParmar FreeDOS comes with a debug.exe-like debugger. With DOSBox, yes, you have to get one. Bochs, I don't know. These are as easy to come by as masm32. – old_timer Sep 03 '18 at 03:34
  • @PeterCordes it still assumes a data set of an unknown size. The presence of these comments and this back and forth should cover the problem though. – old_timer Sep 03 '18 at 03:39
  • @MichaelPetch: I only mentioned debug.exe as an example of a crusty old assembler that doesn't default to decimal, because it's the only one I know of where that isn't the case. But I wouldn't recommend it to anyone over NASM. I updated my answer to explain more about why I brought it up only to dump on it and not recommend it for use *now*. Its actual assembler (source parsing) hasn't been updated, right? Just the output file format support? Doesn't really invalidate my point that it's not up to par compared with other freely available tools. – Peter Cordes Sep 03 '18 at 03:43
  • JFYI, the Turbo Debugger also takes values as hexa by default, when you enter addresses in "go to" in code/memory view, or when you assemble single instruction, but the value must still lead with 0-9 digit, i.e. `cmp al,ff` is invalid and will try to look for "ff" symbol, but `cmp al,0ff` will replace current opcode in code view. (I sometimes use dosbox+TD.EXE to quickly try few more variants of code that I'm writing in other window in text editor). – Ped7g Sep 03 '18 at 07:47

Be careful: assembly languages are generally not standardized the way many higher-level languages are, so the question is quite vague; you didn't even state the instruction set. The masm32 tag implies x86 (that tag was added for you).

It seems that you wanted x86 and the specific subset of the masm family of assemblers.

Assembly is generally defined by the assembler, the tool, not the instruction set. So when you want to know how an assembly language works, or what its rules are, you have to look at the assembler itself. Start with its documentation, if there is any and it is good enough; if not, you have to experiment.

I don't have masm32 handy (it takes some pain to get), but I have another readily available assembler, and you could answer your own question experimentally in the same way. (As already pointed out in another answer: yes, without the h, MASM defaults to decimal.)

mov al,10h
mov al,0x10
mov al,10

which disassembles to

00000000  B010              mov al,0x10
00000002  B010              mov al,0x10
00000004  B00A              mov al,0xa

In this case, a number with no base specified defaults to decimal, which is what you should expect from MASM too, at least for instruction operands.

Non-instruction syntax, which is also part of the assembly language, may follow different rules than the instruction part of the language. One would hope a tool uses the same rules for numbers throughout, but you never know.

Likewise, there may be instructions that use an immediate as an offset from a register rather than as a value loaded into a register. One would hope those immediates follow the same rules too.

Best to experiment and be sure rather than hope that a manual or a web page is complete and correct.

To your title question, which is again very vague: yes, there are assemblers out there that understand octal, decimal, and hexadecimal (and maybe other bases, like base 2), not necessarily all within one tool, and not limited to x86, since the title question was not. What they default to, and what syntax is required to specify a base, is specific to each tool. The point is that assembly language is not like other programming languages: you cannot make generalizations about it. It would be simple for someone to create a new assembler for some target that doesn't conform to the generalization, yet is a perfectly usable tool.
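For one concrete tool, here is a sketch of a single value written in several bases. It assumes NASM (not MASM), whose manual documents these suffixes and prefixes; other assemblers may accept only some of them, or none. The file name bases.asm is hypothetical.

```nasm
; Each line should assemble to the same two bytes, B0 10 (mov al, 16).
; One way to check, assuming nasm and ndisasm are installed:
;   nasm -f bin bases.asm -o bases.bin
;   ndisasm -b 16 bases.bin
bits 16

mov al, 16          ; decimal (the default)
mov al, 16d         ; decimal, explicit suffix
mov al, 10h         ; hexadecimal, suffix
mov al, 0x10        ; hexadecimal, prefix
mov al, 20o         ; octal, suffix
mov al, 20q         ; octal, alternative suffix
mov al, 10000b      ; binary, suffix
mov al, 0b10000     ; binary, prefix
```

Running the same kind of experiment against whatever assembler you actually use is the reliable way to learn its defaults.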

old_timer
  • You don't need all those "not necessarily" qualifiers. NASM does understand base 2, base 8, base 10, and base 16 for numeric literals, with the right suffix or prefix (your choice). [How to represent hex value such as FFFFFFBB in x86 inline assembly programming?](https://stackoverflow.com/a/37152498). – Peter Cordes Sep 03 '18 at 04:01
  • @PeterCordes : As a side note: MASM also supports the suffixes `d`, `o`, `b`, `h` (there are a few other aliased suffixes also supported) – Michael Petch Sep 03 '18 at 04:44
  • As another side note: EASM also supports decimal suffixes D, K, M, G for big numbers, see https://euroassembler.eu/eadoc/#IntNumbers – vitsoft Sep 04 '18 at 07:55