Print a half pyramid of numbers in assembly

Question

I have to write a program in assembly which reads a number and prints a half pyramid of numbers.

i.e: read 4

print

I understand how to read a number and how to use a loop to print one character per line, but I have to use inner loops and I don't know how to initialise another counter.

.286
.model small
.stack 1024h
.data

.code

mov cx,5
mov bx,5

cosmin:

mov dl,31h
mov ah, 2h
int 21h

mov dl, 0Ah
int 21h

loop cosmin

end

Here I've tried to make a triangle with only one character, but I don't know how to increment the values on each line.

.286
.model small
.stack 1024h
.data

.code

mov cx,5
mov bx,5

cosmin:

mov dl,31h
mov ah, 2h
int 21h

mov dl, 0Ah
int 21h

loop cosmin

end

Should the second source differ somehow? It looks the same as the first to me. — Ped7g, Jan 15 '17 at 12:12
You *have* to use an inner loop for each line. You can either save CX (push/pop) from the outer loop, or use some other register and simulate `loop` with `dec`+`jnz`. — Bo Persson, Jan 15 '17 at 12:25

score 3 · Answer 1 · answered Jan 15 '17 at 13:44

To create an inner loop with another loop instruction you must first save the state of the outer loop, as both loop instructions use cx.
Once the inner loop is done, you must restore that state.
Failing to do so corrupts the outer loop's counter; typically the outer loop then cycles forever.

You can save the outer loop counter pretty much wherever you want; if you have a spare register, like bx, use it.
If you don't have such a register, you can always use the stack (push and pop), but besides introducing the constraint of keeping the stack balanced on every loop-break path, it accesses memory and makes retrieving the outer counter clumsier.

loop is not a great instruction: it is restrictive and slow.
Besides, counting up suits your problem better, so I'd avoid loop entirely in favor of a manually managed counter.

;Assume SI holds the number n (>=1) of rows of the pyramid

mov cx, 1                  ;Outer counter (i)

_rows_loop:
 mov bx, 1                 ;Inner counter (j)
_cols_loop:

  ;Print number bx

  inc bx               ;Increment inner counter
  cmp bx, cx           ;If (j<=i) keep looping
 jbe _cols_loop

 ;Print new-line

 inc cx                ;Increment outer counter
 cmp cx, si            ;If (i<=n) keep looping
jbe _rows_loop
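As a minimal sketch, here is the skeleton above with both placeholders filled in, assuming n is a single digit (1 to 9) so every number can be printed as one character through int 21h / ah=02h. The hard-coded `mov si, 4`, the `start` label and the exit through ah=4Ch are my own assumptions for illustration. The loops use `jbe` so that row i prints the numbers 1 through i and the last row is row n.

```asm
.model small
.stack 100h
.code
start:
    mov si, 4               ; assumption: n = 4, hard-coded for illustration
    mov ah, 02h             ; DOS service "write character in DL"
    mov cx, 1               ; outer counter (i)

rows_loop:
    mov bx, 1               ; inner counter (j)
cols_loop:
    mov dl, bl              ; j is 1..9 here, so it fits in bl
    add dl, '0'             ; convert the value to its ASCII digit
    int 21h                 ; print the digit
    mov dl, ' '
    int 21h                 ; print a separating space

    inc bx                  ; increment inner counter
    cmp bx, cx
    jbe cols_loop           ; keep looping while j <= i

    mov dl, 0Dh             ; carriage return
    int 21h
    mov dl, 0Ah             ; line feed
    int 21h

    inc cx                  ; increment outer counter
    cmp cx, si
    jbe rows_loop           ; keep looping while i <= n

    mov ax, 4C00h           ; return to DOS
    int 21h
end start
```

Numbers above 9 would need a real number-to-string routine instead of the single `add dl, '0'`.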

If you run out of spare registers, you can always use the stack but beware that it complicates the loop-breaking code as everything you push must always be balanced by a pop. Plus it accesses memory.

In extreme cases you can use the 8-bit registers: using cl and ch in place of cx and bx in the code above works for counters that fit in 8 bits.

I leave to you the problem of finding the algorithm to generate the pyramid.

How do i use SI register to read n? mov ah,01h int 21h sub al,48 mov si,al — Cosmin Baciu, Jan 18 '17 at 00:36
Thanks a lot for your explanation. I know that i have to study more. — Cosmin Baciu, Jan 18 '17 at 00:39
@CosminBaciu For numbers from 0 to 9 your approach is correct, except that `mov si, al` is invalid, use `xor ah, ah`,`mov si, ax` (or `movzx si, al` if you can afford newer instructions). — Margaret Bloom, Jan 18 '17 at 07:41

score 2 · Answer 2 · answered Jan 15 '17 at 13:28

I don't know how to increment the values per each line.

Well, on each line do inc on wherever the value is stored (either keep it in a spare register, or in memory if you run out of spare registers).

Keep a summary of your current register allocation, in your head or in comments, so you know which registers are spare and which are already used for something.

Make sure you respect the requirements of your external calls: you can freely pick a different register for your own code, but you can't change, for example, int 21h to take the service number in bh, because your DOS vendor already implemented it to take the service number in ah. So either avoid using ah yourself, or use the preserve/restore pattern (below).

Try to keep simple things simple: incrementing a value is just inc. Assembly is actually quite good at this. If you keep a clear picture in your head of what you want, in terms of very simple numerical operations/steps, you can usually find a pretty straightforward and simple combination of ASM instructions doing exactly that and not much else. Quite often nothing else at all.

If you have a hard time mapping some goal to a few assembly instructions in a simple way, your high-level task is probably not broken down enough into simple steps, so break it down a bit more, then try again to find a short, straightforward translation into instructions.

loop rel8 is one of the few more-complex x86 instructions, doing basically this:

dec  cx     (ecx in 32b mode, rcx in 64b mode)
jnz  rel8

But it does not affect flags (the dec + jnz happens internally as a single specialized operation, not literally as two separate dec and jnz instructions), and it is artificially slow on modern x86 CPUs, to help a bit the legacy software that used empty loop $ loops to create "delays" (which is futile, as it is still way too fast for such software, and it "removes" an otherwise very nice opcode for future software :/ ).

So you may want to prefer the actual two-instruction "dec cx" + "jnz rel8" combination for real programming; it performs better on modern x86 CPUs.
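A small sketch of that replacement in nested form, assuming a toy task of printing 3 lines of 5 asterisks (the task, the label names and the register choice are mine, for illustration only). Because each counter lives in its own register, nothing has to be preserved or restored around the inner loop.

```asm
.model small
.stack 100h
.code
start:
    mov ah, 02h             ; DOS service "write character in DL"
    mov cx, 3               ; outer counter: 3 lines
outer_loop:
    mov bx, 5               ; inner counter: 5 characters per line
inner_loop:
    mov dl, '*'
    int 21h                 ; print one asterisk
    dec bx                  ; sets ZF when bx reaches zero
    jnz inner_loop          ; repeat until the inner counter is zero

    mov dl, 0Dh             ; carriage return
    int 21h
    mov dl, 0Ah             ; line feed
    int 21h

    dec cx                  ; outer counter, untouched by the inner loop
    jnz outer_loop

    mov ax, 4C00h           ; return to DOS
    int 21h
end start
```

Unlike loop, dec does modify the flags (all arithmetic flags except CF), so don't rely on flags surviving across it.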

In assembly, CPU registers are like "super globals", i.e. there is a single cx per CPU core (internally that's not true on modern x86, but that is how it appears to behave from the outside, from the programmer's point of view).

So if you need two different values in it, like counter1 and counter2, you will have to write extra code preserving the appropriate cx value where needed and loading the other one as needed.

For example two nested loops done by loop:

    mov cx,10
outer_loop:
    mov bx,cx    ; preserve outer counter in bx
    mov cx,5
inner_loop:
    ; some loop code
    loop inner_loop
    mov cx,bx    ; restore outer counter
    loop outer_loop

Or if you are short of spare registers, you can use the stack; the "naive" way is:

    mov cx,10
outer_loop:
    push cx      ; preserve outer counter
    mov  cx,5
inner_loop:
    ; some loop code
    loop inner_loop
    pop  cx      ; restore outer counter
    loop outer_loop

(C++ compilers would solve this differently, allocating a local variable in stack space, so instead of push/pop the code would access the same memory spot directly through [bp-x] (or [esp+x] in 32-bit code, since 16-bit addressing cannot use sp as a base), avoiding the sp adjustment that every push/pop performs.)
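A rough sketch of that compiler-style approach, hand-written rather than actual compiler output: the outer counter lives in one word of stack space at [bp-2], reserved once, and is decremented in place. The toy task (3 lines of 5 asterisks), the labels and the frame layout are my own assumptions.

```asm
.model small
.stack 100h
.code
start:
    push bp
    mov bp, sp
    sub sp, 2               ; reserve one word: [bp-2] = outer counter
    mov word ptr [bp-2], 3  ; 3 lines
    mov ah, 02h             ; DOS service "write character in DL"

outer_loop:
    mov cx, 5               ; inner counter, cx is free for loop
inner_loop:
    mov dl, '*'
    int 21h
    loop inner_loop

    mov dl, 0Dh             ; carriage return
    int 21h
    mov dl, 0Ah             ; line feed
    int 21h

    dec word ptr [bp-2]     ; update the counter in memory, sp stays put
    jnz outer_loop

    mov sp, bp              ; release the local variable
    pop bp
    mov ax, 4C00h           ; return to DOS
    int 21h
end start
```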

But if you take a look at the previous part of my answer, you should be able to find a different way to handle nested loops with two counters, without additional preserve/restore instructions.

But the pattern of preserving/restoring a value in a particular target register is something you have to fully understand and be able to use in all sorts of situations (even if it is not needed for nested loops). For example, if you read the documentation for int 21h with ah=2, you will see it cares only about the ah and dl values (and modifies al). So, for example, dh is "spare".

Then if you want to output two characters, A and a space, but still want to end with A in the main "variable" (dl in the next example), you can do this:

init_part:
    mov   dx,' '*256 + 'A'  ; dh = ' ', dl = 'A'
    mov   ah,2              ; output single char service
    ; some other init code, etc..

inner_part_somewhere_later:
    int   21h               ; output dl to screen (initially 'A')
    xchg  dl,dh             ; preserves old "dl" and loads "dh" into it (swaps them)
    int   21h               ; output the old dh (space), which is now in dl
    xchg  dl,dh             ; restores 'A' in dl
    ; so *here* you can operate with 'dl'
    ; as some inner_part loop "variable"
    ; modifying it for another inner_part iteration

Finally, if you have a task like yours and the solution is not obvious, one useful reasoning step is "backtracking" from what you want.

You know you want output on screen (<NL> = new line):

1<NL>
1 2<NL>

So imagine what that means at the lowest level, in the end. There are of course several ways to achieve it (including writing whole lines, prepared in a memory buffer, instead of single chars), but if I stick with your single-char output, the wanted output translates into this need:

To call int 21h, ah=2 with dl set to:
[49 (digit 1), 13 (carriage return), 10 (line feed), 49, 32 (space), 50 (digit 2), 13, 10].

That doesn't look obviously "loopable", but if you add more lines, the pattern "digit + space" for the inner loop emerges. You can also "cheat" a bit and output one useless space after the last digit, as it will be "invisible" to a normal user. At that point you should be able to "backtrack" to this high-level design:

char_per_line_count = 1
ending_char_count = 2
[lines_loop:
   char = '1'
   line_counter = char_per_line_count
   [chars_loop:
      int 21h,2 with char
      int 21h,2 with space
      ++char
      loop to chars_loop while (--line_counter)]
   int 21h,2 with 13
   int 21h,2 with 10
   ++char_per_line_count
   loop to lines_loop while (char_per_line_count <= ending_char_count)]

Now you can try to run it a few times in your head, to verify the output is really what you want.

Once you have such a high-level overview of how to achieve the desired output, you can start looking for a nice way to implement each step of it.

If you understood each previous part of this answer, I think it will be quite easy to rewrite this algorithm into ASM instructions. Just keep the comments in the code, ahead of each group of instructions.

Then, when you are debugging, you can easily compare what the code really does with the comment saying what it should have done, find the discrepancy and fix it.
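One possible translation of the high-level design into instructions, as a sketch only: `char` is incremented after each digit and the outer loop runs while `char_per_line_count <= ending_char_count`. The register allocation (bl, bh, cl, dl, dh), the label names, reading n as a single unvalidated key press and the exit through ah=4Ch are all my assumptions. Each group of instructions keeps the pseudocode line it implements as a comment ahead of it.

```asm
.model small
.stack 100h
.code
start:
    ; ending_char_count = key pressed ('1'..'9' assumed, not validated)
    mov ah, 01h
    int 21h                 ; AL = ASCII code of the key, echoed by DOS
    sub al, '0'
    mov bh, al              ; bh = ending_char_count

    ; char_per_line_count = 1
    mov bl, 1               ; bl = char_per_line_count
    mov ah, 02h             ; from here on: "write character in DL"

    ; start on a fresh line after the echoed digit
    mov dl, 0Dh
    int 21h
    mov dl, 0Ah
    int 21h

lines_loop:
    ; char = '1', line_counter = char_per_line_count
    mov dx, ' '*256 + '1'   ; dh = ' ', dl = char
    mov cl, bl              ; cl = line_counter

chars_loop:
    ; int 21h,2 with char, then with space (char stays in dl)
    int 21h
    xchg dl, dh
    int 21h
    xchg dl, dh

    ; ++char, loop to chars_loop while (--line_counter)
    inc dl
    dec cl
    jnz chars_loop

    ; int 21h,2 with 13 and with 10
    mov dl, 0Dh
    int 21h
    mov dl, 0Ah
    int 21h

    ; ++char_per_line_count, loop while it is <= ending_char_count
    inc bl
    cmp bl, bh
    jbe lines_loop

    ; return to DOS
    mov ax, 4C00h
    int 21h
end start
```

With a non-digit key this prints garbage lines but does not crash, which is the "garbage in -> garbage out" level of handling.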

But the main thing to compare your code against, all the time, is the final on-screen output you defined. Whenever you are stuck, compare your current output with the desired one, find a discrepancy, assess which one looks easiest to fix, and try to fix it. If no discrepancy remains, you are sort of "done", although I would strongly suggest taking another long look at your code, to see whether it can be simplified and whether it works correctly for corner cases (like "what happens if the user enters a letter instead of a digit").

It's not important to have code that handles every corner case correctly, but you should know what will happen in every case, and decide whether that is "good enough" or not (as a rule of thumb, "garbage in -> garbage out" is fine, "garbage in -> crash or damage of data" is not cool, "garbage in -> meaningful fix or error message" is cool).

BTW, that high-level algorithm, in your particular case can be modified to remove `line_counter` completely, and do the loop control by using `char` value and different kind of loop condition. In ASM code that would result into trading `dec + jnz` for `cmp end_value + jnz` and using one register less (so then you have one more spare for other values). — Ped7g, Jan 15 '17 at 13:39
