2

org 0x7c00 is the normal way to get correct absolute addresses in a flat binary, but I was curious about a different way which I expected to work.

I tried using section boot vstart=0x7c00 align=1 to tell YASM the right memory address, with a symbol in another section that uses start=300.

mov [symbol+$$], register

yasm -fbin boot.asm gives error: effective address too complex on that line.

From my understanding, symbol+$$ should be able to be processed into a number (instead of a segment+offset), right? If I am wrong, please tell me, but if I am right then why does YASM tell me that the address is too complex?

Is there another way to use start= and/or vstart= instead of org and still get correct absolute addressing?

Using [symbol] doesn't work; that assembles to an absolute address of [0000]


The reason I wanted to do this is that I have binary machine code for a boot loader that relocates itself, but it stores a few values in some symbols before it relocates (for example, the boot drive, which is passed in dl).

YASM supports a binary program with "sections" that can have different addressing offsets. I set the code up so that the MBR code is the first 300 bytes of the first sector, and the variables are stored after those 300 bytes and before the 446th byte. I wanted to use this method so that I can use variables that technically belong to other sections but are addressed relative to the current section's offset.

Here is a simplified example of what I am trying to do:

; example.asm
; yasm -fbin example.asm

%define virtual(_name, _offset) section _name vstart=_offset align=1
%define absolute(_name, _offset) section _name start=_offset align=1

virtual(boot, 0x7c00) ; Virtual Offset of 0x7c00 (in-file offset of 0)
start:
    ; This is just an example
    ; There isn't going to be much here.
    mov [boot_drive+$$], dl

    cli 
    hlt

absolute(vars, 300) ; Virtual AND in-file offset of 300

boot_drive db 0
Peter Cordes
  • The assembler doesn't know the value for either of them. Relocation entries only support a single symbol. Why do you even want to add `$$`? – Jester Nov 23 '19 at 17:05
  • @Jester I forgot to mention that this is for binary format, not relocatable. – WolfHybrid23 Nov 23 '19 at 17:06
  • Please post [mcve]. You might also want to explain what problem you are trying to solve. – Jester Nov 23 '19 at 17:13
  • Looks like that should work, but indeed it doesn't. Given this is 16 bit code, the simple workaround is to just change segment register when relocating so all your offsets remain unchanged. – Jester Nov 23 '19 at 17:55
  • You're not doing yourself any favours playing around with sections like this. Boot sectors can be written much more simply and clearly using the ORG directive. – Ross Ridge Nov 23 '19 at 18:37
  • For the record, NASM 2.14.02's error message is more explanatory: `error: invalid effective address: multiple base segments`. But like Ross said, use `org 0x7c00` - that's what it's for. – Peter Cordes Nov 23 '19 at 18:52
  • Have you considered looking at how I did this in the answer to your other question? https://stackoverflow.com/a/59010176/3857942 . I use an org at the top, and use your macro just to place the partition and the boot sector in the proper place. I also take advantage of the fact that _DL_ isn't modified anywhere in the code I wrote so its value when running the VBR is the same as when the MBR started running. I can modify that answer to show how you can implement a variable like `boot_drive` if you really wish to see how that could work. – Michael Petch Nov 23 '19 at 20:06
  • I know, it would be easier to read my code if I did it that way, and I decided I will do it that way. The problem here is no longer about *the way* I want to implement it, but about the way it should work, but doesn't. @PeterCordes I am not using NASM, I am using YASM (Note the Y). And even if I were using NASM this should in theory still work since the expression can and should be evaluated to a constant by the processor or preprocessor (whichever one handles that) before it gets assembled. – WolfHybrid23 Nov 23 '19 at 20:11
  • YASM was based on NASM; they are rather similar, although NASM has more features. However, neither happens to support what you are doing (at least not in the way you are doing it). They should, but they don't. But you can rework the code to avoid the problem altogether. For instance, *after* relocating the boot sector, save DL to boot_drive and use ORG 0x0600 for the bootloader rather than ORG 0x7c00 – Michael Petch Nov 23 '19 at 20:14
  • @MichaelPetch I already said in my comment that I choose to go with the way you did things. The only reason I made this post was to see if it was possible to do this because I was curious. – WolfHybrid23 Nov 23 '19 at 20:15
  • @WolfHybrid23: Yes, I know the difference between YASM and NASM. They're mostly syntax-compatible, even down to using the same macro directives. That's why I used NASM to see what error message *it* would produce. You're right that there's a possibly-interesting question here about what that expression should evaluate to, and why YASM and NASM refuse to eval it. – Peter Cordes Nov 23 '19 at 20:19
  • I made a major edit to the first part of your question to say you already know about `org`, and still ask about how the YASM syntax works. – Peter Cordes Nov 23 '19 at 20:32
  • I should add there's one place that defining a section could be useful in your code, and that would be to move your uninitialized variables out of your bootsector. Since they're uninitialized they don't need to take up space in your bootsector and can be placed anywhere. You can do this with a "nobits" section, which is a section that doesn't appear in the output. Since it can't have any actual contents you'd allocate space with the RESB family of directives, e.g. `section vars nobits vstart=0x800` `boot_drive RESB 1`. You can access `boot_drive` directly without adding anything. – Ross Ridge Nov 23 '19 at 23:53
  • @RossRidge : if going that route I'd recommend looking at the ABSOLUTE directive. – Michael Petch Nov 24 '19 at 00:15
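
The suggestion in the last two comments, keeping uninitialized variables out of the sector image, might look roughly like the sketch below. It uses the ABSOLUTE directive; the file name and the scratch address 0x0800 are assumptions made for illustration (0x0800 is borrowed from the nobits example in the comment above), and the relocation and padding code is left out.

```nasm
; vars_sketch.asm (hypothetical file name)
; Assemble with: nasm -f bin vars_sketch.asm -o vars_sketch.bin

        bits 16
        org 0x0600              ; address the code runs at after relocation

start:
        mov [boot_drive], dl    ; plain reference to the variable, no "+ $$"
        cli
        hlt

; Assumed scratch address, outside the 512-byte sector image.
absolute 0x0800
boot_drive:     resb 1          ; names one byte at 0x0800, emits nothing to the file
```

Because boot_drive is defined under ABSOLUTE, it is a plain number rather than a section-relative symbol, so it takes no space in the output and can be used directly in an effective address.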

2 Answers

3

Your basic problem is that you're not actually adding two numbers, you're adding two symbols, and assemblers don't generally allow this. This is because object file formats don't have any way to represent the addition of two symbols as a relocation, and that's because it doesn't really make much sense to add two symbols. While in this case you're generating a binary file which doesn't support relocations, and so the assembler could invent its own virtual relocations that handle this, apparently this hasn't been implemented in YASM as an exception to the general rule.

Why assemblers don't allow adding symbols

The reason why the addition of two symbols doesn't make sense in the general case, when object files may be generated, is that symbols are more than just numbers. They also refer to a section, and sections can end up living anywhere in memory. Your [boot_drive + $$] expression says to take the actual address of boot_drive as loaded in memory and add it to the actual address of the start of the current section. When generating object files, an assembler has no idea what these actual addresses will be, because the sections the symbols belong to could be put anywhere. Even the linker may not know: if it's generating a relocatable executable, the addresses depend on where the operating system loads the executable.

(This ignores the fact that you've told the assembler that boot_drive should be treated as having a different actual address than the assembler would otherwise think it has. This is also something that your assembler doesn't support in the usual case of outputting an object file.)
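
As a rough illustration of the point about relocations, here is a minimal sketch for an object-file target. The file name and the labels foo, bar and demo are made up for the example. The line that adds two symbols is left commented out because it is the kind of expression the assembler is expected to reject.

```nasm
; reloc_sketch.asm (hypothetical file name)
; Assemble with: nasm -f elf32 reloc_sketch.asm -o reloc_sketch.o

section .data
foo:    dd 1
bar:    dd 2

section .text
global demo
demo:
        mov eax, [foo]          ; one symbol: a single relocation against .data
        mov eax, [foo + 4]      ; one symbol plus a constant: still one relocation
        ; mov eax, [foo + bar]  ; two symbols added together: no single relocation
        ;                       ; can describe this, so expect an assembler error
        ret
```

In the first two instructions the final address is "wherever .data ends up, plus a known offset", which a relocation entry can express. The commented-out line would need "wherever .data ends up, counted twice", which is not a meaningful address.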

Binary files could be an exception, but aren't

Now, in the case of generating a binary file, there's no linker involved, so YASM could know that boot_drive has an "actual" address of 300 and that $$ has an actual address of 0x7c00. But this would require that the assembler make an exception when evaluating effective addresses, one it would have to propagate to the backend that generates binary files. That exception hasn't been implemented in your assembler, and you may have a hard time convincing the YASM (or NASM) developers to do so.

Your difficulty convincing them would come from the fact that even with binary files it doesn't really make sense to add two symbols, even if you could. Your example code would only work because the address of boot_drive isn't its actual address. Indeed, the reason why you're adding $$ to it is to calculate its actual address. Your example use case is contrived and unnecessary, since there are better ways to write a bootloader that relocates itself, so it doesn't make a good argument for why it can make sense to add two symbols.

There's probably no direct workaround

As for a workaround, I can't really think of any direct solution that would still involve using boot_drive and $$. When someone tries to add two symbols there's often a way it can be rewritten in a form that works, often by subtracting two symbols. Subtracting two symbols that are in the same section is supported by assemblers, as it removes the common section from the equation. So for example, [foo + bar_begin - bar_end] could be written as [foo + (bar_begin - bar_end)]. However, I'm not sure what there is that you can subtract from boot_drive and $$ to remove either of their sections from the equation.
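
The subtraction trick described above can be sketched as follows. This is only an illustration with made-up labels (msg_begin, msg_end, table) and a hypothetical file name; it does not solve the boot_drive + $$ case.

```nasm
; diff_sketch.asm (hypothetical file name)
; Assemble with: nasm -f bin diff_sketch.asm -o diff_sketch.bin

        bits 16
        org 0x7c00

start:
        ; The parenthesized difference collapses to the constant 5,
        ; leaving one symbol plus a number.
        mov si, table + (msg_end - msg_begin)
        mov cx, MSG_LEN         ; loads 5, the length of the string below
        cli
        hlt

msg_begin:
        db "hello"
msg_end:

; Both labels are in the same section, so their difference is a plain number.
MSG_LEN equ msg_end - msg_begin

table:
        times 8 db 0
```

The difference of two labels in the same section no longer depends on where that section is placed, which is why it can be combined freely with a single remaining symbol.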

While I'm sure there's some other way of solving your problem that would still let you accomplish what you want using the section directives you're using, I'm not going to bother trying to figure out what that might be. Instead I'm going to suggest a workaround that you've said you don't want, if not for your own benefit then for the benefit of others that might come to this post in a similar situation.

My solution, even if it's not what you want

My solution is to not use section directives to solve the problem of a bootsector living at two different addresses. Instead you can use an ORG that reflects where the majority of the code lives after being copied. The small amount of code that needs to be executed at the original location can easily be made position independent so it doesn't care what ORG is used.

The following is the framework of a self-relocating MBR boot block. Most of the code necessary for implementing an MBR has been left out for brevity.

    BITS    16

RELOC_OFFSET EQU 0x600

    ORG RELOC_OFFSET

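start:
    xor ax, ax              ; zero DS, ES and SS so offsets are linear addresses
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7c00          ; stack grows down from the load address

    mov di, RELOC_OFFSET    ; destination: 0x600
    mov si, 0x7c00          ; source: where the BIOS loaded this sector
    mov cx, 512 / 2         ; one sector, copied as 256 words
    cld
    rep movsw
    jmp 0:relocated_entry   ; far jump into the copy, also sets CS to 0

relocated_entry:
    mov [boot_drive], dl    ; first variable access happens after the move
    ; ...
    mov dl, [boot_drive]
    jmp 0:0x7c00            ; far jump to 0x7c00, where the elided code is
                            ; expected to have loaded the next stage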

boot_drive DB   0

    TIMES   446 - ($ - $$) DB 0
partition_table:
    DB  0x80, 0x01, 0x00, 0x05, 0x17, 0x01, 0x03, 0x01, 0x04, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00
    ; ...

    TIMES   510 - ($ - $$) DB 0
    DB  0x55, 0xaa

The key thing here is that boot_drive is only accessed after the code has been moved. There's no need to save DL any earlier because the initial code doesn't need to change DL. Indeed, it may be possible to eliminate saving DL altogether, as it's generally not necessary to modify DL in an MBR bootsector. The TIMES directive is used to ensure that the partition table and magic number are where they need to be.

Here's the output of objdump -D -b binary -m i8086 -M intel --adjust-vma=0x600:

 600:   31 c0                   xor    ax,ax
 602:   8e d8                   mov    ds,ax
 604:   8e c0                   mov    es,ax
 606:   8e d0                   mov    ss,ax
 608:   bc 00 7c                mov    sp,0x7c00
 60b:   bf 00 06                mov    di,0x600
 60e:   be 00 7c                mov    si,0x7c00
 611:   b9 00 01                mov    cx,0x100
 614:   fc                      cld    
 615:   f3 a5                   rep movs WORD PTR es:[di],WORD PTR ds:[si]
 617:   ea 1c 06 00 00          jmp    0x0:0x61c
 61c:   88 16 29 06             mov    BYTE PTR ds:0x629,dl
 620:   8a 16 29 06             mov    dl,BYTE PTR ds:0x629
 624:   ea 00 7c 00 00          jmp    0x0:0x7c00
    ...
 7bd:   00 80 01 00             add    BYTE PTR [bx+si+0x1],al
 7c1:   05 17 01                add    ax,0x117
 7c4:   03 01                   add    ax,WORD PTR [bx+di]
 7c6:   04 00                   add    al,0x0
 7c8:   00 00                   add    BYTE PTR [bx+si],al
 7ca:   04 00                   add    al,0x0
    ...
 7fc:   00 00                   add    BYTE PTR [bx+si],al
 7fe:   55                      push   bp
 7ff:   aa                      stos   BYTE PTR es:[di],al
Ross Ridge
  • "Binary files could be an exception, but aren't" -- In the NASM architecture, flat binary output is just another output format for the frontend, which is linked internally by the backend. That is, there is a linker and an intermediate object format there, albeit both internal to the assembler. That is why bin output doesn't allow fancy relocations which could be uniquely supported by it, eg https://sourceforge.net/p/nasm/feature-requests/176/ – ecm Nov 24 '19 at 15:48
2

YASM seems rather flakey, but NASM seems to be a bit more sane when using $-$$ in a critical expression (like a section's start option). After playing with this for a bit, I have a reason not to use YASM.

If you are willing to use NASM it looks like you may be able to get this to work. I removed the macros to show how it could be done with basic section directives:

; example.asm
; nasm -fbin example.asm -o example.bin

BOOT_ORG       EQU 0x7c00
BOOT_RELOC_ORG EQU 0x0600

section unreloc start=0x0000 vstart=BOOT_ORG align=16
start:
    ; This is just an example
    ; There isn't going to be much here.
    mov [boot_drive], dl
    mov al, [partition_1]
    ; Do relocation code here

    jmp 0x0000:reloc_start

section reloc follows=unreloc vstart=BOOT_RELOC_ORG+($-$$) align=1
reloc_start:
    mov [boot_drive], dl
    mov al, [partition_1]
    cli
    hlt

section vars start=300 vstart=300+BOOT_ORG align=1
boot_drive db 0

section parttbl start=446 vstart=446+BOOT_ORG align=1
partition_1:
dq 0x80, 0
partition_2:
dq 0, 0
partition_3:
dq 0, 0
partition_4:
dq 0, 0

section bootsig start=510 vstart=510+BOOT_ORG align=1
dw 0xaa55

When I use ndisasm -b16 -o0x7c00 example.bin I get this output which seems correct:

00007C00  88162C7D          mov [0x7d2c],dl
00007C04  A0BE7D            mov al,[0x7dbe]
00007C07  EA0C060000        jmp word 0x0:0x60c
00007C0C  88162C7D          mov [0x7d2c],dl
00007C10  A0BE7D            mov al,[0x7dbe]
00007C13  FA                cli
00007C14  F4                hlt
00007C15  0000              add [bx+si],al
00007C17  0000              add [bx+si],al
00007C19  0000              add [bx+si],al
00007C1B  0000              add [bx+si],al
00007C1D  0000              add [bx+si],al
00007C1F  0000              add [bx+si],al
00007C21  0000              add [bx+si],al
00007C23  0000              add [bx+si],al
00007C25  0000              add [bx+si],al
00007C27  0000              add [bx+si],al
00007C29  0000              add [bx+si],al
00007C2B  0000              add [bx+si],al
00007C2D  0000              add [bx+si],al
00007C2F  0000              add [bx+si],al
00007C31  0000              add [bx+si],al
00007C33  0000              add [bx+si],al
00007C35  0000              add [bx+si],al
00007C37  0000              add [bx+si],al
00007C39  0000              add [bx+si],al
00007C3B  0000              add [bx+si],al
00007C3D  0000              add [bx+si],al
00007C3F  0000              add [bx+si],al
00007C41  0000              add [bx+si],al
00007C43  0000              add [bx+si],al
00007C45  0000              add [bx+si],al
00007C47  0000              add [bx+si],al
00007C49  0000              add [bx+si],al
00007C4B  0000              add [bx+si],al
00007C4D  0000              add [bx+si],al
00007C4F  0000              add [bx+si],al
00007C51  0000              add [bx+si],al
00007C53  0000              add [bx+si],al
00007C55  0000              add [bx+si],al
00007C57  0000              add [bx+si],al
00007C59  0000              add [bx+si],al
00007C5B  0000              add [bx+si],al
00007C5D  0000              add [bx+si],al
00007C5F  0000              add [bx+si],al
00007C61  0000              add [bx+si],al
00007C63  0000              add [bx+si],al
00007C65  0000              add [bx+si],al
00007C67  0000              add [bx+si],al
00007C69  0000              add [bx+si],al
00007C6B  0000              add [bx+si],al
00007C6D  0000              add [bx+si],al
00007C6F  0000              add [bx+si],al
00007C71  0000              add [bx+si],al
00007C73  0000              add [bx+si],al
00007C75  0000              add [bx+si],al
00007C77  0000              add [bx+si],al
00007C79  0000              add [bx+si],al
00007C7B  0000              add [bx+si],al
00007C7D  0000              add [bx+si],al
00007C7F  0000              add [bx+si],al
00007C81  0000              add [bx+si],al
00007C83  0000              add [bx+si],al
00007C85  0000              add [bx+si],al
00007C87  0000              add [bx+si],al
00007C89  0000              add [bx+si],al
00007C8B  0000              add [bx+si],al
00007C8D  0000              add [bx+si],al
00007C8F  0000              add [bx+si],al
00007C91  0000              add [bx+si],al
00007C93  0000              add [bx+si],al
00007C95  0000              add [bx+si],al
00007C97  0000              add [bx+si],al
00007C99  0000              add [bx+si],al
00007C9B  0000              add [bx+si],al
00007C9D  0000              add [bx+si],al
00007C9F  0000              add [bx+si],al
00007CA1  0000              add [bx+si],al
00007CA3  0000              add [bx+si],al
00007CA5  0000              add [bx+si],al
00007CA7  0000              add [bx+si],al
00007CA9  0000              add [bx+si],al
00007CAB  0000              add [bx+si],al
00007CAD  0000              add [bx+si],al
00007CAF  0000              add [bx+si],al
00007CB1  0000              add [bx+si],al
00007CB3  0000              add [bx+si],al
00007CB5  0000              add [bx+si],al
00007CB7  0000              add [bx+si],al
00007CB9  0000              add [bx+si],al
00007CBB  0000              add [bx+si],al
00007CBD  0000              add [bx+si],al
00007CBF  0000              add [bx+si],al
00007CC1  0000              add [bx+si],al
00007CC3  0000              add [bx+si],al
00007CC5  0000              add [bx+si],al
00007CC7  0000              add [bx+si],al
00007CC9  0000              add [bx+si],al
00007CCB  0000              add [bx+si],al
00007CCD  0000              add [bx+si],al
00007CCF  0000              add [bx+si],al
00007CD1  0000              add [bx+si],al
00007CD3  0000              add [bx+si],al
00007CD5  0000              add [bx+si],al
00007CD7  0000              add [bx+si],al
00007CD9  0000              add [bx+si],al
00007CDB  0000              add [bx+si],al
00007CDD  0000              add [bx+si],al
00007CDF  0000              add [bx+si],al
00007CE1  0000              add [bx+si],al
00007CE3  0000              add [bx+si],al
00007CE5  0000              add [bx+si],al
00007CE7  0000              add [bx+si],al
00007CE9  0000              add [bx+si],al
00007CEB  0000              add [bx+si],al
00007CED  0000              add [bx+si],al
00007CEF  0000              add [bx+si],al
00007CF1  0000              add [bx+si],al
00007CF3  0000              add [bx+si],al
00007CF5  0000              add [bx+si],al
00007CF7  0000              add [bx+si],al
00007CF9  0000              add [bx+si],al
00007CFB  0000              add [bx+si],al
00007CFD  0000              add [bx+si],al
00007CFF  0000              add [bx+si],al
00007D01  0000              add [bx+si],al
00007D03  0000              add [bx+si],al
00007D05  0000              add [bx+si],al
00007D07  0000              add [bx+si],al
00007D09  0000              add [bx+si],al
00007D0B  0000              add [bx+si],al
00007D0D  0000              add [bx+si],al
00007D0F  0000              add [bx+si],al
00007D11  0000              add [bx+si],al
00007D13  0000              add [bx+si],al
00007D15  0000              add [bx+si],al
00007D17  0000              add [bx+si],al
00007D19  0000              add [bx+si],al
00007D1B  0000              add [bx+si],al
00007D1D  0000              add [bx+si],al
00007D1F  0000              add [bx+si],al
00007D21  0000              add [bx+si],al
00007D23  0000              add [bx+si],al
00007D25  0000              add [bx+si],al
00007D27  0000              add [bx+si],al
00007D29  0000              add [bx+si],al
00007D2B  0000              add [bx+si],al
00007D2D  0000              add [bx+si],al
00007D2F  0000              add [bx+si],al
00007D31  0000              add [bx+si],al
00007D33  0000              add [bx+si],al
00007D35  0000              add [bx+si],al
00007D37  0000              add [bx+si],al
00007D39  0000              add [bx+si],al
00007D3B  0000              add [bx+si],al
00007D3D  0000              add [bx+si],al
00007D3F  0000              add [bx+si],al
00007D41  0000              add [bx+si],al
00007D43  0000              add [bx+si],al
00007D45  0000              add [bx+si],al
00007D47  0000              add [bx+si],al
00007D49  0000              add [bx+si],al
00007D4B  0000              add [bx+si],al
00007D4D  0000              add [bx+si],al
00007D4F  0000              add [bx+si],al
00007D51  0000              add [bx+si],al
00007D53  0000              add [bx+si],al
00007D55  0000              add [bx+si],al
00007D57  0000              add [bx+si],al
00007D59  0000              add [bx+si],al
00007D5B  0000              add [bx+si],al
00007D5D  0000              add [bx+si],al
00007D5F  0000              add [bx+si],al
00007D61  0000              add [bx+si],al
00007D63  0000              add [bx+si],al
00007D65  0000              add [bx+si],al
00007D67  0000              add [bx+si],al
00007D69  0000              add [bx+si],al
00007D6B  0000              add [bx+si],al
00007D6D  0000              add [bx+si],al
00007D6F  0000              add [bx+si],al
00007D71  0000              add [bx+si],al
00007D73  0000              add [bx+si],al
00007D75  0000              add [bx+si],al
00007D77  0000              add [bx+si],al
00007D79  0000              add [bx+si],al
00007D7B  0000              add [bx+si],al
00007D7D  0000              add [bx+si],al
00007D7F  0000              add [bx+si],al
00007D81  0000              add [bx+si],al
00007D83  0000              add [bx+si],al
00007D85  0000              add [bx+si],al
00007D87  0000              add [bx+si],al
00007D89  0000              add [bx+si],al
00007D8B  0000              add [bx+si],al
00007D8D  0000              add [bx+si],al
00007D8F  0000              add [bx+si],al
00007D91  0000              add [bx+si],al
00007D93  0000              add [bx+si],al
00007D95  0000              add [bx+si],al
00007D97  0000              add [bx+si],al
00007D99  0000              add [bx+si],al
00007D9B  0000              add [bx+si],al
00007D9D  0000              add [bx+si],al
00007D9F  0000              add [bx+si],al
00007DA1  0000              add [bx+si],al
00007DA3  0000              add [bx+si],al
00007DA5  0000              add [bx+si],al
00007DA7  0000              add [bx+si],al
00007DA9  0000              add [bx+si],al
00007DAB  0000              add [bx+si],al
00007DAD  0000              add [bx+si],al
00007DAF  0000              add [bx+si],al
00007DB1  0000              add [bx+si],al
00007DB3  0000              add [bx+si],al
00007DB5  0000              add [bx+si],al
00007DB7  0000              add [bx+si],al
00007DB9  0000              add [bx+si],al
00007DBB  0000              add [bx+si],al
00007DBD  00800000          add [bx+si+0x0],al
00007DC1  0000              add [bx+si],al
00007DC3  0000              add [bx+si],al
00007DC5  0000              add [bx+si],al
00007DC7  0000              add [bx+si],al
00007DC9  0000              add [bx+si],al
00007DCB  0000              add [bx+si],al
00007DCD  0000              add [bx+si],al
00007DCF  0000              add [bx+si],al
00007DD1  0000              add [bx+si],al
00007DD3  0000              add [bx+si],al
00007DD5  0000              add [bx+si],al
00007DD7  0000              add [bx+si],al
00007DD9  0000              add [bx+si],al
00007DDB  0000              add [bx+si],al
00007DDD  0000              add [bx+si],al
00007DDF  0000              add [bx+si],al
00007DE1  0000              add [bx+si],al
00007DE3  0000              add [bx+si],al
00007DE5  0000              add [bx+si],al
00007DE7  0000              add [bx+si],al
00007DE9  0000              add [bx+si],al
00007DEB  0000              add [bx+si],al
00007DED  0000              add [bx+si],al
00007DEF  0000              add [bx+si],al
00007DF1  0000              add [bx+si],al
00007DF3  0000              add [bx+si],al
00007DF5  0000              add [bx+si],al
00007DF7  0000              add [bx+si],al
00007DF9  0000              add [bx+si],al
00007DFB  0000              add [bx+si],al
00007DFD  0055AA            add [di-0x56],dl

I personally wouldn't use this approach, as it can be done more easily using the method I discussed in my previous answer about creating a relocatable bootloader, which is rather similar to Ross Ridge's answer to this question.

Michael Petch
  • The reason I was using YASM was because I thought sections in NASM were only for relocatable formats, thanks for clearing that up! After seeing this, I do agree that there is not really a reason to use YASM. – WolfHybrid23 Nov 23 '19 at 22:25