
I have an issue with the 8086 assembler I am writing. The problem is with the assembler passes.

During pass 1 you calculate the position relative to the segment for each label.

Now to do this the size of each instruction must be calculated and added to the offset.

Some 8086 instructions should be encoded in a smaller form if the target label is within range. For example, "jmp _label" would choose a short jump if it could, and if it couldn't it would use a near jump.

The problem is that in pass 1 a forward label has not been reached yet, so the assembler cannot determine the size of the instruction, as "jmp short _label" is smaller than "jmp near _label".

So how can I decide whether "jmp _label" becomes a "jmp short _label" or not?

Three passes may also be a problem as we need to know the size of every instruction before the current instruction to even give an offset.

Thanks

btlog
NibbleBits
  • I thought of a possible solution, let me know if you agree. What I could do is have it take a guess and choose a short jump. After it has finished calculating the sizes for the segment, it would trace back and see if there was a mistake; if there was, it would correct every affected instruction's offset. This would be slow but would work. If anyone has a better way, please let me know. – NibbleBits Jan 01 '17 at 20:07
  • I'm writing an assembler for the 8086. I have written a compiler that generates assembly, and from there an assembler will create the machine code. – NibbleBits Jan 01 '17 at 20:26
  • The folks here can help you. http://board.flatassembler.net If you do get a chance, talk to Tomasz and ask him. – TheRealChx101 Jan 01 '17 at 20:42
  • Thank you for that link. – NibbleBits Jan 01 '17 at 20:46
  • Do you really need to write yet another x86 assembler? There are plenty of free and good ones: NASM, YASM, FASM, GNU assembler... In my [Smaller C](https://github.com/alexfru/SmallerC) compiler I'm letting NASM do the work (and I can configure the compiler to use YASM and in some cases FASM). – Alexey Frunze Jan 02 '17 at 02:00
  • Are you not reading my comments? I am writing a compiler that generates assembly language. I need to assemble this. It would be stupid to just have it pass to an external assembler. – NibbleBits Jan 02 '17 at 03:17
  • It's not necessarily stupid to pass it to an external assembler. My compiler allows asm(string_literal) statements, whose text goes out to the external assembler, so I get cheap inline assembly support for free. And that's very handy. I use it in my C library when invoking DOS, Windows and Linux system calls. – Alexey Frunze Jan 02 '17 at 05:33

1 Answer


What you can do is start with the assumption that a short jump is going to be sufficient. If the assumption becomes invalid when you find out the jump distance (or when it changes), you expand your short jump to a near jump. After this expansion you must adjust the offsets of the labels following the expanded jump (by the length of the near jump instruction minus the length of the short jump instruction). This adjustment may make some other short jumps insufficient and they will have to be changed to near jumps as well. So, there may actually be several iterations, more than 2.
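
The iteration described above can be sketched roughly as follows. This is a minimal illustration, not a real assembler: the program is modelled as a hypothetical list of `("label", name)`, `("code", nbytes)` and `("jmp", target)` records, and it assumes the 8086 encodings of 2 bytes for a short `jmp` (`EB rel8`) and 3 bytes for a near `jmp` (`E9 rel16`), with the displacement measured from the end of the jump instruction. The function name `relax` is made up for this example.

```python
SHORT_SIZE = 2  # assumed: EB rel8
NEAR_SIZE = 3   # assumed: E9 rel16

def relax(items):
    """Start with every jump short and grow jumps until all displacements fit."""
    is_near = [False] * len(items)
    while True:
        # Sweep 1: recompute every offset with the current jump sizes.
        offsets, labels, pos = [], {}, 0
        for i, (kind, arg) in enumerate(items):
            offsets.append(pos)
            if kind == "label":
                labels[arg] = pos
            elif kind == "code":
                pos += arg
            else:
                pos += NEAR_SIZE if is_near[i] else SHORT_SIZE
        # Sweep 2: promote every short jump whose rel8 displacement overflows.
        changed = False
        for i, (kind, arg) in enumerate(items):
            if kind == "jmp" and not is_near[i]:
                disp = labels[arg] - (offsets[i] + SHORT_SIZE)
                if not -128 <= disp <= 127:
                    is_near[i] = True
                    changed = True
        if not changed:  # jumps only ever grow, so this terminates
            return labels, is_near, pos

# Expanding the second jump pushes label "b" out of range of the first one.
program = [("jmp", "b"), ("code", 125), ("jmp", "c"),
           ("label", "b"), ("code", 128), ("label", "c")]
labels, is_near, total = relax(program)
```

Because the loop works on size records rather than on emitted bytes or source text, nothing is moved in memory or reparsed between iterations; machine code would only be written once the sizes have settled.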

When implementing this you should avoid moving code in memory when expanding jump instructions. It will severely slow down assembling. You should not reparse the assembly source code either.

You may also pre-compute some kind of interdependency table between jumps and labels, so you can skip labels and jump instructions unaffected by an expanded jump instruction.
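
One possible shape for such a table, as a hedged sketch: for each jump, record which other jumps have it inside their span (the bytes between the jump and its target), and recheck only those when it grows. The record format (`("label", name)`, `("code", nbytes)`, `("jmp", target)`), the names `relax_worklist` and `dependents`, and the 2-byte/3-byte jump sizes are assumptions made for this illustration.

```python
from collections import deque

SHORT_SIZE, NEAR_SIZE = 2, 3  # assumed sizes of "jmp short" and "jmp near"

def relax_worklist(items):
    size = [arg if kind == "code" else SHORT_SIZE if kind == "jmp" else 0
            for kind, arg in items]
    label_at = {arg: i for i, (kind, arg) in enumerate(items) if kind == "label"}
    jumps = [i for i, (kind, _) in enumerate(items) if kind == "jmp"]

    def span(j):
        # Item range whose bytes make up the displacement, plus the rel8 limit.
        t = label_at[items[j][1]]
        # Forward: items after the jump up to the label (at most 127 bytes).
        # Backward: from the label through the jump itself (at most 128 bytes).
        return (j + 1, t, 127) if t > j else (t, j + 1, 128)

    # dependents[k]: jumps whose displacement grows when jump k grows.
    dependents = {k: [j for j in jumps
                      if j != k and span(j)[0] <= k < span(j)[1]]
                  for k in jumps}

    work = deque(jumps)
    while work:
        j = work.popleft()
        if size[j] == NEAR_SIZE:
            continue  # already expanded, nothing left to check
        lo, hi, limit = span(j)
        if sum(size[lo:hi]) > limit:
            size[j] = NEAR_SIZE
            work.extend(dependents[j])  # only these can be affected

    # One final sweep assigns label offsets from the settled sizes.
    labels, pos = {}, 0
    for (kind, arg), n in zip(items, size):
        if kind == "label":
            labels[arg] = pos
        pos += n
    return labels, {j for j in jumps if size[j] == NEAR_SIZE}

program = [("jmp", "b"), ("code", 125), ("jmp", "c"),
           ("label", "b"), ("code", 128), ("label", "c")]
labels, near = relax_worklist(program)
```

Summing the span on every check keeps the sketch short; a real implementation would more likely keep a running distance per jump and add the growth to it instead.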

Another thing to think about is that a short jump can reach at most 127 bytes forward. So once the instructions following a jump amount to more than 127 bytes and the target label has still not been encountered, you can change the jump to a near jump right then. Keep in mind that at any moment you may have up to 64 forward short jumps pending that may become near in this fashion.
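
A rough sketch of that early promotion, under stated assumptions: the program is a hypothetical list of `("label", name)`, `("code", nbytes)` and `("jmp", target)` records, a short jump is 2 bytes, and every other jump is counted at its short size. That makes the byte counts a lower bound, so any jump flagged here certainly needs the near form, while jumps it misses still have to be caught by the later iterations. The name `promote_early` is made up for this example.

```python
SHORT_SIZE = 2  # assumed size of "jmp short"

def promote_early(items):
    """Single sweep: flag forward jumps that can no longer be short."""
    pending = {}          # jump index -> minimum bytes emitted after that jump
    must_be_near = set()
    seen = set()          # labels already encountered
    for i, (kind, arg) in enumerate(items):
        if kind == "label":
            seen.add(arg)
            # Jumps waiting for this label fit in rel8 so far; stop tracking.
            for j in [j for j in pending if items[j][1] == arg]:
                del pending[j]
            continue
        size = arg if kind == "code" else SHORT_SIZE
        for j in list(pending):
            pending[j] += size
            if pending[j] > 127:      # target not seen and already out of reach
                must_be_near.add(j)
                del pending[j]
        if kind == "jmp" and arg not in seen:
            pending[i] = 0            # forward jump: start counting after it
    return must_be_near

# 65 consecutive forward jumps: only the first has more than 127 bytes after it.
program = [("jmp", "end")] * 65 + [("label", "end")]
flagged = promote_early(program)
```

Since each entry in `pending` is dropped as soon as more than 127 bytes follow it, and the smallest instruction counted here is a 2-byte jump, the table stays small, which matches the bound of 64 mentioned above.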

Alexey Frunze
  • I suggested a similar thing in a comment on my own post; your post is a much more in-depth solution, thank you. I too thought to myself that the moment I change one jump it could affect the others, you're completely right. I am glad someone agrees with me, this must mean I am on the right track. Much appreciated :) – NibbleBits Jan 02 '17 at 03:23
  • I don't suppose you can help with my other question, can you? Nobody has answered and it's been about a month: http://stackoverflow.com/questions/41022380/omfobject-module-format-length-field-appears-incorrect – NibbleBits Jan 02 '17 at 03:27
  • Keeping track of open possibly-near jumps shouldn't take too much time, since it's usually only a handful (as Alexey mentions, < 64), and it can prevent you from moving blocks of memory. I would prefer this option over a (maybe huge) cross-linked related-jumps list. And you can predict: if 2 jumps are open, and already `128-2x(number of jumps open)` bytes are used, both will be long – Tommylee2k Jan 02 '17 at 15:13