I'm reading the NASM documentation and am stuck on the following code in section 3.2.5, TIMES: Repeating Instructions or Data:

buffer: db      'hello, world' 
        times 64-$+buffer db ' '

They say this code will store exactly enough spaces to make the total length of buffer up to 64. Unfortunately, I didn't get it at all. The expression 64-$+buffer, which is supposed to evaluate to a number, looks very suspicious. So I'd like someone to explain the semantics in case I got it wrong. I don't know enough yet to print the resulting number or to check whether the space was allocated as intended. Here is how I tried to parse it:

  1. 64-$+buffer is an arithmetic expression that evaluates to a number.
  2. $ is the current location, which should be equal to 13.
  3. buffer is a labeled location, and it equals 0 if it's at the very beginning of section .data. Otherwise, we quickly get a negative number (which I suppose isn't what's intended here).

If the above is true, then we get a 64-byte buffer whose first 12 bytes are hello, world, followed by spaces. Am I right?

Timur Fayzrakhmanov
    I think it's easier to understand if it's written as `64 - ($ - buffer)`. `$ - buffer` is the number of bytes already placed, so 64 minus that is the number of spaces needed to pad out the buffer so the entire size is 64 bytes. – prl Dec 28 '19 at 23:28

2 Answers

Yes, you are right: the $ symbol is basically the current target address while assembling. Let's look at some example values:

buffer: db 'hello, world' 
        times 64-$+buffer db ' '

We'll start by setting buffer to some arbitrary address like 27. The 12 characters of the message then run from 27 through 38 inclusive, so $ will be 39 following that.

The times count will then be (64 - 39 + 27) or 52, and that plus the 12 characters total 64.

So yes, assuming your string is less than 64 characters, it will be padded out with enough spaces to make 64 in total (if it's longer than 64, you'll probably get an assembler error because you're supplying a negative count).
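The numbers above can be checked with a small model. This is only a sketch in Python, not NASM: it assumes `$` is simply the label's address plus the bytes emitted so far, and the function name `assemble_padded` and the base addresses are made up for illustration. It also assumes the string fits in the buffer.

```python
def assemble_padded(text, base, size=64, fill=b" "):
    """Model `buffer: db text` followed by `times size-$+buffer db fill`.

    Assumes the text fits in `size` bytes and `fill` is a single byte.
    """
    buffer = base                    # address given to the label (arbitrary)
    out = bytearray(text)            # db 'hello, world'
    dollar = base + len(out)         # `$` at the start of the `times` line
    count = size - dollar + buffer   # the TIMES repeat count
    out += fill * count              # db ' ' repeated `count` times
    return count, bytes(out)


for base in (0, 27, 0x7C00):
    count, data = assemble_padded(b"hello, world", base)
    print(base, count, len(data))    # count is 52 and len(data) is 64 each time
```

Whatever address the label gets, it cancels out of the expression, so the count depends only on how many bytes were already placed.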

paxdiablo

Yes, it pads to 64 bytes past the label buffer, effectively doing something like char buffer[64] = "hello, world" in C except it pads with spaces instead of '\0' zero bytes. So you have a fixed-size buffer of spaces with a db string at the start, and you can change the string without affecting the total buffer size.


The times 64-$+buffer is the same thing as times (64+buffer) - $. It can be looked at as two addresses that get subtracted to get a byte count:

  • 64+buffer is where we want to be after padding: 64 bytes past the label buffer.
  • $ is where we are now, the current output position/address.
    (How does $ work in NASM, exactly?)

64+buffer - $ is thus how many bytes of padding we need, so using this as a times repeat count on a db ' ' will get us there.
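To see that the different ways of grouping the expression agree, here is a minimal Python sketch of the arithmetic (not NASM). The function name `padding_forms` and the sample addresses 27 and 39 are just illustrative values.

```python
def padding_forms(buffer, dollar, size=64):
    """Return the padding count computed three equivalent ways."""
    as_written = size - dollar + buffer           # 64-$+buffer
    end_minus_here = (size + buffer) - dollar     # (64+buffer) - $
    size_minus_placed = size - (dollar - buffer)  # 64 - ($ - buffer)
    return as_written, end_minus_here, size_minus_placed


print(padding_forms(buffer=27, dollar=39))        # (52, 52, 52)
```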


Sometimes it helps understanding to look at how it would break if you changed something:

If we'd used times ... db 1,2 or something, it would repeat that 2-byte sequence that many times, padding twice as much as it should. The thing we use times on has to be exactly 1 byte, which db ' ' is because it's a single-byte string/character constant.

(If you did want to pad with a repeating pattern of more than 1 byte, you could use something like times (64-$+buffer + 1)/2 dw 0xabcd. The (x+1)/2 formula divides by 2, rounding up instead of down.)
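The effect of the unit size can be sketched with plain arithmetic. This is a Python model rather than NASM, and the helper name `size_after_times` is hypothetical; it only multiplies the repeat count by the size of the repeated unit.

```python
def size_after_times(placed, count, unit_size):
    """Bytes after `buffer` once `times count <unit>` has been emitted."""
    return placed + count * unit_size


placed = 12                      # len('hello, world'), i.e. $ - buffer
remaining = 64 - placed          # 64-$+buffer = 52

# times 64-$+buffer db 1,2  -> 52 repeats of a 2-byte unit
print(size_after_times(placed, remaining, 2))             # 116, not 64

# times (64-$+buffer + 1)/2 dw 0xabcd  -> 26 repeats of a 2-byte word
print(size_after_times(placed, (remaining + 1) // 2, 2))  # 64

# With an odd gap (13 bytes placed, 51 remaining) rounding up overshoots by 1
print(size_after_times(13, (51 + 1) // 2, 2))             # 65
```

Rounding up guarantees at least 64 bytes, so an odd gap ends one byte past the target.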


The idiom is most often seen in legacy BIOS boot sectors, where it places the "signature" magic number in the last 2 bytes of the 512-byte file. The question "simple boot sector coding: Filling the 512 Byte with 0" uses the classic times 510-($-$$) db 0, which is algebraically the same thing: pad to 510 bytes from the start of the section ($$).
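As a rough illustration of the boot-sector layout, here is a Python sketch that builds the same 512 bytes by hand. It is not NASM; `build_boot_sector` is a made-up helper, and the two-byte "code" (`EB FE`, a jump to itself) is only a placeholder. The signature bytes 0x55, 0xAA are what `dw 0xAA55` stores in little-endian order.

```python
def build_boot_sector(code):
    """Model: <code>, then `times 510-($-$$) db 0`, then `dw 0xAA55`."""
    emitted = len(code)          # $ - $$: bytes since the start of the section
    pad = 510 - emitted          # the TIMES repeat count
    if pad < 0:
        raise ValueError("code does not fit in 510 bytes")
    return code + b"\x00" * pad + b"\x55\xaa"   # signature lands at 510..511


sector = build_boot_sector(b"\xeb\xfe")         # placeholder code: jmp $
print(len(sector), sector[510:].hex())          # 512 55aa
```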


If your before-padding string/code is too large, the times expression becomes negative and you get an error message. (e.g. How to fix "os.asm:113: error: TIMES value -138 is negative" in assembly language)
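The failing case is the same subtraction going below zero. A small Python sketch of that check, assuming a 70-byte string in a 64-byte buffer; `times_count` is a hypothetical helper and its message only imitates the wording of the NASM error quoted above:

```python
def times_count(size, placed):
    """Return size - placed, rejecting negative counts like NASM's TIMES does."""
    count = size - placed
    if count < 0:
        raise ValueError(f"TIMES value {count} is negative")
    return count


print(times_count(64, 12))       # 52
try:
    times_count(64, 70)          # 70 bytes already placed in a 64-byte buffer
except ValueError as err:
    print(err)                   # TIMES value -6 is negative
```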

Peter Cordes
  • The last trick is indeed amazing but it took time (until now) to figure it out :) Sorry, I wanted to make your answer as an accepted but another one is by far more accessible for newbies from my point of view. – Timur Fayzrakhmanov Dec 28 '19 at 18:37
    @TimurFayzrakhmanov: that's fine, accept the answer that you find more helpful, and/or that you think is more readable for future readers that google the same question. It's not like being not-accepted makes it invisible. BTW, I thought my answer would be easier to understand than @ pax's by just breaking down the algebra without getting bogged down with numbers. So it's useful feedback to hear that wasn't the case. Maybe in future answers I'll use more detailed examples. Or maybe not; having different styles of answers to the same question isn't a bad thing. – Peter Cordes Dec 29 '19 at 00:28
    @TimurFayzrakhmanov: I made some edits to maybe make it more beginner-friendly. I'm curious what you as a beginner think of it. I think the example of how it would break if you changed the `db` might help understanding how the pieces fit together. (I'm not expecting you to change your accept vote, I'm just wondering if this looks clearer than it did before to you.) – Peter Cordes Dec 29 '19 at 02:20
  • @PeterCordes: thank you for such a high concern! After scrutinizing and rereading all the additions, (unfortunately) I can't tell you it has become any clearer. It feels like you're talking to someone who is *already* in the context of all this low-level stuff. You've made 4 digressions: the C example at the beginning, repeating >1 byte patterns, the formula for rounding up (which might overflow if `$` isn't aligned by 2) and leaving 2 bytes at the end of the boot sector. All the tricks are really worth mentioning (and I'm happy I was able to digest them all and I knew C a priori). – Timur Fayzrakhmanov Dec 30 '19 at 09:56
    But the original OP's concern wasn't solved until the first horizontal line WITHOUT any additional prerequisites. Also, as a non-native speaker, I find the verb `pads out to` difficult to understand. Finally, @prl gave the tip (as a comment) which I found kinda KEY to my misunderstanding: `64 - $ + start_addr` is the same as `64 - ($ - start_addr)` but AFTER reducing the algebra. Yes, you've made a similar connection in the BIOS digression showing `510-($-$$)` but it was difficult to line it up. – Timur Fayzrakhmanov Dec 30 '19 at 09:57
  • @TimurFayzrakhmanov: Thanks for the feedback. The key point of my answer is to break down the algebra, I just did it differently than @ prl. prl's `64 – bytes_already_placed` works too, but I found it more natural to think in terms of start and end address. I added an hrule separator between that section and the section about how it would break if you changed something (that was the point of the `dw` section: looking at what would happen if you changed something can help in understanding how the original works.) – Peter Cordes Dec 30 '19 at 10:20
  • And BTW, I thought of overflow in the `(x+1)/2` but that's only possible if the address of the last byte of the `dw` wraps around. Or if NASM internally uses 64-bit integer math then we're definitely fine. Also, before linking there isn't a real "address", just a difference between `$` and a label. So you'd only possibly have a problem if you had something like `times 1<<31 dw 0` inside the padding region, assuming NASM was "only" doing 32-bit math. – Peter Cordes Dec 30 '19 at 10:24
  • I get that this might be a lot to take in, but I try to highlight the key points in my answers so people can stop after understanding that if they want. I guess I need to remember to always do that even in short answers. Or my answers just aren't your style; that's fine; a lot of readers do like my level of detail so I'm not going to change that. – Peter Cordes Dec 30 '19 at 10:27