I don't know how to increment the values per each line.
Well, on each line, do inc where-the-value-is-stored (keep the value in some spare register, or in memory if you run out of spare registers).
Keep a summary of your current register allocation, in your head or in comments, so you know which registers are spare and which are already used for something.
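A minimal sketch of such a summary kept in comments, assuming NASM syntax and a 16-bit DOS target. The register choices and the label line_value are hypothetical, picked only for illustration, and this is a fragment rather than a complete program:

```asm
; Register allocation (assumed for this sketch):
;   ah = reserved for the int 21h service number
;   dl = character to print (required by int 21h, ah=2)
;   bl = value incremented once per line
;   cx = loop counter
;   bh, dh, si, di = spare

        inc     bl                  ; per-line value kept in a spare register

        ; or, when no register is spare, keep the value in memory instead:
        inc     byte [line_value]

; data, to be placed outside of the executed instruction path
line_value: db 1                    ; hypothetical variable, initial value 1
```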
Make sure you meet the requirements of your external calls: you can easily pick a different register for your own code, but you can't change, for example, int 21h to take the service number in bh, as your DOS vendor has already implemented it to accept the service number in ah. So either avoid using ah yourself, or use the preserve/restore pattern (below).
Try to keep simple things simple: incrementing a value is just inc. Assembly is actually quite good at this. If you keep a clear picture in your head of what you want, in terms of very simple numerical operations/steps, you can usually find a pretty straightforward and simple combination of ASM instructions doing exactly that and not much else. Quite often nothing else at all.
If you have a hard time mapping some wish onto a few assembly instructions in a simple way, your high-level task is probably not broken down into simple enough steps, so break it down a bit more, then try again to find a short, straightforward translation into instructions.
loop rel8 is one of the few more-complex x86 instructions, doing basically this:
dec cx ; ecx in 32b mode, rcx in 64b mode
jnz rel8
But it does not affect flags (the dec + jnz happens internally as a single specialized operation, not literally as the two original dec + jnz instructions), and it is slow on many modern x86 CPUs, reportedly kept that way on purpose to help legacy SW which used empty loop $ loops to create "delays" (it's futile, as it's still way too fast for such SW, and it "removes" an otherwise very nice opcode for future SW :/ ).
So you may want to prefer the actual two-instruction "dec cx, jnz rel8" combination for real programming; it will perform better on such CPUs.
In assembly, CPU registers are like "super globals", i.e. there is a single cx per CPU core (internally that is not true on modern x86, but this is how it appears to behave from the outside, from the programmer's point of view).
So if you need two different values in it, like counter1 and counter2, you have to write extra code which preserves the appropriate cx value where needed and loads the other one as needed.
For example, two nested loops done with loop:
mov cx,10
outer_loop:
mov bx,cx ; preserve outer counter in bx
mov cx,5
inner_loop:
; some loop code
loop inner_loop
mov cx,bx ; restore outer counter
loop outer_loop
Or, if you are short of spare registers, you can use the stack. The "naive" way is:
mov cx,10
outer_loop:
push cx ; preserve outer counter
mov cx,5
inner_loop:
; some loop code
loop inner_loop
pop cx ; restore outer counter
loop outer_loop
(C++ compilers would resolve this in a different way, allocating a local variable in stack space, so instead of push/pop the code would access the same memory spot directly through [sp+x] or [bp-x], saving performance by not adjusting sp on every use, as push/pop does.)
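A hedged sketch of that compiler-like approach, assuming NASM syntax and 16-bit code. It uses the [bp-x] form because plain 16-bit addressing modes cannot use sp as a base register; the slot [bp-2] for the outer counter is my own choice for illustration:

```asm
        push    bp                  ; preserve the caller's bp
        mov     bp,sp
        sub     sp,2                ; reserve one word: [bp-2] = outer counter

        mov     word [bp-2],10      ; outer counter lives in stack memory
outer_loop:
        mov     cx,5                ; cx is free for the inner loop
inner_loop:
        ; some loop code
        loop    inner_loop
        dec     word [bp-2]         ; count the outer loop directly in memory
        jnz     outer_loop

        mov     sp,bp               ; release the local variable
        pop     bp                  ; restore the caller's bp
```

The stack pointer is adjusted only once on entry and once on exit, instead of once per outer iteration as with push/pop.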
But if you take another look at the previous part of my answer, you should be able to find a different way to solve nested loops with two counters, without any additional preserve/restore instructions.
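One possible reading of that hint, sketched under the assumption that bx is spare: since the "dec + jnz" pair works with any register, each loop can simply get its own counter.

```asm
        mov     bx,10           ; outer counter in bx (assumed to be spare)
outer_loop:
        mov     cx,5            ; inner counter in cx
inner_loop:
        ; some loop code
        dec     cx
        jnz     inner_loop      ; repeat inner loop while cx != 0
        dec     bx
        jnz     outer_loop      ; repeat outer loop while bx != 0
```

Unlike loop, dec modifies the flags (all the arithmetic ones except CF), so the loop body must not rely on flag values surviving past the dec.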
Still, this pattern of preserving/restoring a value in a particular target register is something you have to fully understand and be able to use in all sorts of situations (even if it is not needed for nested loops). For example, if you read the documentation of int 21h, ah=2, you can see it cares only about the ah and dl values (and modifies al). So, for example, dh is "spare".
Then if you want to output two characters, A and space, but still end up with A in the main "variable" (dl in the next example), you can do this:
init_part:
mov dx,' '*256 + 'A' ; dh = ' ', dl = 'A'
mov ah,2 ; output single char service
; some other init code, etc..
inner_part_somewhere_later:
int 21h ; output dl to screen (initially 'A')
xchg dl,dh ; preserves old "dl" and loads "dh" into it (swaps them)
int 21h ; output dl to screen (now the space swapped in from dh)
xchg dl,dh ; restores 'A' in dl
; so *here* you can operate with 'dl'
; as some inner_part loop "variable"
; modifying it for another inner_part iteration
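The same preserve/restore idea also covers the case where you do need ah or al for your own value around the call. A minimal sketch using the stack, assuming ax holds something you still need afterwards:

```asm
        ; ax holds a value we still need, but the service number must go
        ; into ah, and int 21h, ah=2 modifies al
        push    ax              ; preserve our ax (both ah and al)
        mov     ah,2            ; output single char service
        mov     dl,'A'
        int     21h             ; output 'A'
        pop     ax              ; restore our original ax
```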
Finally, if you have a task like yours and the solution is not obvious, one of the reasoning steps can be "backtracking" from what you want.
You know you want this output on screen (<NL> = new line):
1<NL>
1 2<NL>
So imagine what that means at the bottom level, in the end. There are of course several ways to achieve it (including writing whole lines, prepared in a memory buffer, instead of single chars), but if I stick with your single-char output, the wanted output translates into this need:
To call int 21h, ah=2 with dl set to: [49 (digit 1), 13 (carriage return), 10 (line feed), 49, 32 (space), 50 (digit 2), 13, 10].
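To make that bottom-level view concrete, here is a sketch that feeds exactly this byte sequence to the service one value at a time. It assumes NASM syntax and a DOS .COM program; the labels chars, chars_len and print_next are hypothetical names, and ah is assumed to still hold 2 after each call, as in the earlier example:

```asm
        org     100h                ; assumed: DOS .COM program

        mov     si,chars            ; si = address of the next value
        mov     cx,chars_len        ; cx = number of values to output
        mov     ah,2                ; output single char service
print_next:
        mov     dl,[si]             ; dl = value for this call
        int     21h
        inc     si                  ; advance to the next value
        dec     cx
        jnz     print_next

        mov     ax,4C00h            ; terminate program, exit code 0
        int     21h

chars:      db 49,13,10,49,32,50,13,10
chars_len   equ $ - chars
```

This hard-codes the two lines, so it only illustrates what the calls must receive; it does not yet solve the task of generating the sequence.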
That doesn't look obviously "loopable", but if you added more lines, the pattern "digit + space" for the inner loop would emerge. You can also "cheat" a bit and output one useless space after the last digit, as it will be "invisible" to a normal user. At that point you should be able to "backtrack" to this high-level design:
char_per_line_count = 1
ending_char_count = 2
[lines_loop:
char = '1'
line_counter = char_per_line_count
[chars_loop:
int 21h,2 with char
int 21h,2 with space
++char
loop to chars_loop while (--line_counter)]
int 21h,2 with 13
int 21h,2 with 10
++char_per_line_count
loop to lines_loop while (char_per_line_count <= ending_char_count)]
Now you can try to run it a few times in your head, to verify the output is really what you want.
Once you have such a high-level overview of how to achieve the desired output, you can start looking for a way to implement its particular steps nicely.
If you understood each previous part of this answer, I think it will be quite easy to rewrite this algorithm into ASM instructions. Just keep the comments in the code, ahead of each particular group of instructions.
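One possible translation of the design above, as a sketch rather than the definitive solution. It assumes NASM syntax and a DOS .COM program, and it makes two assumptions about the design: char is incremented after each printed digit, and the outer loop runs while the count is less than or equal to the ending count, so that both lines appear. The register allocation and the name ENDING_CHAR_COUNT are my own choices:

```asm
        org     100h                ; assumed: DOS .COM program

ENDING_CHAR_COUNT equ 2             ; digits on the last line (1..9)

; Register allocation:
;   ah = 2 (output single char service), dl = char, dh = space
;   bl = char_per_line_count, cl = line_counter

        ; char_per_line_count = 1
        mov     ah,2
        mov     bl,1
lines_loop:
        ; char = '1' (with space kept in dh), line_counter = char_per_line_count
        mov     dx,' '*256 + '1'
        mov     cl,bl
chars_loop:
        ; output char, then space, then advance to the next digit
        int     21h
        xchg    dl,dh
        int     21h
        xchg    dl,dh
        inc     dl
        ; loop to chars_loop while (--line_counter)
        dec     cl
        jnz     chars_loop
        ; output new line (13, 10)
        mov     dl,13
        int     21h
        mov     dl,10
        int     21h
        ; ++char_per_line_count, loop while it is <= ending count
        inc     bl
        cmp     bl,ENDING_CHAR_COUNT
        jbe     lines_loop

        ; terminate program, exit code 0
        mov     ax,4C00h
        int     21h
```

Each comment restates one step of the high-level design, so the instructions under it can be checked against that single step.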
Then, when you are debugging some bug, you can easily compare what the code really does with the comment saying what it should have done, find the discrepancy and fix it.
But the main thing to compare your code against, all the time, is that defined final output on screen. Whenever you are stuck, compare your current output with the desired one, find the discrepancies, assess which one looks easiest to fix, and try to fix it. If no discrepancy is left, you are sort of "done", although I would strongly suggest taking another long look at your code, to see whether it can be simplified and whether it works correctly for corner cases (like "what happens if the user enters a letter instead of a digit").
It's not important to have code which handles every corner case correctly, but you should know what will happen in every case, and decide whether that is "good enough" or not (as a rule of thumb, "garbage in -> garbage out" is fine, "garbage in -> crash or damage of data" is not cool, "garbage in -> meaningful fix or error message" is cool).