How to print a triangle of stars in assembly?

Question

I need to get the following output:

*
**
***
****
*****
******
*******
********
*********
**********

So it's 10 rows, and my stars will start at 1 and go to 10.

Currently I am getting:

**********
***********
************
*************
**************
***************
****************
*****************
******************
*******************
********************

My code:

section .data

char db ' '
trianglesize db 0;      ;stars per line
trianglerows db 10;

section .text
global _start
_start

mov rax, [trianglerows] ;rows
outer_loop:
    mov rbx, [trianglerows]
    inner_loop:
    call star
    dec bx
    cmp bx,0
    jg inner_loop
call newline
call down_triangle
dec ax
cmp ax, 0
jne outer_loop
call newline
call exit

exit:
  mov eax,1 ;sys_exit
  mov ebx,0     ;return 0
  int 80h;
  ret

newline:
  mov [char],byte 10
  push rax;
  push rbx;
  mov eax,4;    ;sys_write
  mov ebx,1;    ;stdout
  mov ecx, char;
  mov edx,1;    ;size of new line
  int 80h

  pop rbx;
  pop rax;
  ret

star:
  mov [char], byte '*';
  push rax;
  push rbx;
  mov eax,4;    ;sys_write
  mov ebx,1;    ;stdout
  mov ecx, char;
  mov edx,1;
  int 80h;
  pop rbx;
  pop rax;
  ret

down_triangle:
  push rax;
  push rbx;

  mov rax, [trianglerows]
  inc ax
  mov [trianglerows],rax

  pop rbx
  pop rax
  ret

I tried and tried and tried but I couldn't get what I needed to get.

I seem to be unable to find a way to separate the rows from the lines of stars, because of all those push and pop.

Honestly, I do not understand these much. I've been told that they are needed to execute the loops, but I am not sure why, for example, in the function star I would need to call the outer loop.

I couldn't find any combination of push and pop that worked. I am constantly getting either many stars or one star per line or just one star.

I am literally puzzled at which bits I'm changing and keeping the same. I was able to get the required output but one that never ended increasing.

I was able to get output starting from 10 stars and going down to one, but never what I wanted.

What am I doing wrong? How do I do this question?

Use debugger to see yourself what bits you are changing and where. Use [instruction reference guide](http://www.felixcloutier.com/x86/) to read about every instruction you use, to cross-check against things you observe in debugger. Don't worry if you don't fully understand everything on first try, keep re-reading and re-checking. Plus try to find somewhat better tutorials/books for a start. You can start with this [MASM tutorial](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), then NASM/YASM has some differences in syntax, but that is covered in NASM docs: http://www.nasm.us/doc/ — Ped7g, Mar 17 '18 at 23:12
And BTW, if you are cheeky bastard, you can output that 10 row triangle with single string defined as `starline: db '**********',10` (that's 10 stars and new-line char), outputting (with `sys_write`) from address starline+9 at first row to starline+0 at last row, only 2 chars at first row, then 3 chars, etc... ... or for more fun you can just generate whole multi-line string into memory first and then output it in single write, etc... there are millions of ways. Keep trying. — Ped7g, Mar 17 '18 at 23:16
@Ped7g This really isn't my first try I explained that. I've been trying to do this for most of this week. This is all that was given to me to work with and understand. I am using a Linux terminal to do this, what debugger is available from there..? Also I HAVE to implement this using loops. I am looking for realistic help here, what exactly causes my code to malfunction? — Papbad, Mar 17 '18 at 23:19
Also now I did notice you are using `int 0x80` in 64b mode, that's illegal abuse of linux kernel benevolence https://stackoverflow.com/q/46087730/4271923 so your *"all that was given to me"* contains either wrong info, or you didn't follow it properly. Plus it's ridiculous to follow only your official sources, when trying to learn to code in assembly. Use whatever you can find (and you will also find lot of BS, so keep running in large circles and check several things at the same time, filter out unpromising/low quality stuff). Debugger: `gdb` is text based (will take time to learn to control) — Ped7g, Mar 17 '18 at 23:23
Realistically it's impossible to help you, because this task is so trivial, that either you are willing to learn to code in assembly, then head toward those tutorials and books and keep reading and learning and experimenting... or you want somebody else to write it for you, then hire a coder. There's not much to go wrong on two loops, as long as you have some basic idea what you are doing. There's no other way to get anything working in assembly, it's not like high level languages, that you can randomly edit this and that and keep trying until it works, that's impossible in asm. — Ped7g, Mar 17 '18 at 23:25
use this for many more pointers and resources: https://stackoverflow.com/tags/x86/info .... EDIT: BTW, to be at least of some help, I will tell you at least the first instruction where you have bug in your code after executing it. The first bug is at `mov rax, [trianglerows]` (will load 8 bytes value instead of single byte defined, use rather `movzx eax,byte [trianglerows]` to zero-extend 8 bit value into 32 bit `eax`, which will automatically clear upper 32 bits of `rax`, because that's how x86_64 works). ... that didn't take long... hm. — Ped7g, Mar 17 '18 at 23:27
@Ped7g I just want to understand what I am doing. Also, I was finally able to get the result I wanted (why do I always get it only after asking on overflow, without even help of anyone here...). I changed the line in the inner loop to 'cmp bx, 9'. But I really don't understand why I couldn't store this value in a register. For example, I tried to do var1 db 9 in section .data, then do mov eax, var1 before loops to eventually say cmp bx, var1. Why was I unable to do that? My program would keep printing when I wrote this. How come? Why am I not able to store a value normally? — Papbad, Mar 17 '18 at 23:44
Did you read about the first bug I pointed out? Did you understand it? Probably not. `bx` is low 16 bits (two bytes) of `rbx` (64 bit), so if you have bogus values in upper 48 bits of `rbx`, using just `bx` will "save" you and make it work. As you are stubbornly defining those values as bytes, why don't you use the 8 bit registers for them? - *"I just want to understand what I am doing"* - so did you read through that tutorial? Was it clear? Are you now studying `gdb` controls? Etc? Your code does maybe output what was desired, but it's full of bugs and inaccuracies, so keep *learning*... — Ped7g, Mar 18 '18 at 00:46
Your code is at least 5x larger than a nice simple version should be. Your choices of which block should be a separate function makes things inconvenient for you. For example, you have a big function that just prints one `*`, instead of a whole line. `sys_write` naturally takes a variable number of characters, and system calls are expensive, so you should get in the habit of passing multiple characters whenever possible. — Peter Cordes, Mar 18 '18 at 01:08
@Ped7g's clever idea of decrementing a pointer back from the end of a string ending with `...***\n` is very cool, but you could write a loop to store some number of `*` chars and then a newline into a buffer and `write` that. You don't need anything in memory other than the buffer you will write; x86-64 has lots of registers. Loading and storing static storage and pushing/popping registers is what's making your code more complicated. Just let system calls clobber `rax`, and `rbx` if you want. (`syscall` instead of `int 0x80` wouldn't even use `rbx`, BTW.) — Peter Cordes, Mar 18 '18 at 01:10
Okay I agree there is many things I dont understand. Unfortunately I wasn't given enough material, and this is one of a few things I had to do as part of a class. I was given a sample code and it just used a lot of push and pull, so I was taking it from that code even though it may have multiple errors in it. Indeed I need to stretch myself and I will through trial and error. Unfortunately it tends to be that you understand the basics through higher stuff, or you're forced to understand the basics through that. Thanks for everyones' help. — Papbad, Mar 18 '18 at 15:05
@Papbad yeah, it's very difficult with limited or misleading resources, it's not like I don't understand why you made this question and effort, I was just trying to warn you, that at this moment you are still at beginning, before grasp of basics and there's lot of work ahead of you. Then again grasping basic principles of assembly will give you great insight into how computer works, and which kind of tasks are easier/more difficult to solve with it, and how to do it efficiently. But you should explore it for a year or two at least (like every week at least one day). And daily now, at start. — Ped7g, Mar 18 '18 at 15:25
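
The single-string idea from the comments above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: 64-bit Linux, the native `syscall` interface (not `int 80h`), and `r12`/`r13` as counters because system calls leave them untouched. The `ROWS` constant is a made-up name.

```nasm
; Sketch: one constant string, one write per row.
; Assumed build: nasm -f elf64 starline.asm && ld -o starline starline.o
default rel
ROWS equ 10                          ; hypothetical constant, matches the string below

section .rodata
starline: db '**********', 10        ; 10 stars followed by a newline

section .text
global _start
_start:
    lea   r12, [starline + ROWS - 1] ; r12 = address of the last star
    mov   r13d, 2                    ; r13 = bytes to write: one star + newline
next_row:
    mov   eax, 1                     ; sys_write (x86-64 numbering)
    mov   edi, 1                     ; stdout
    mov   rsi, r12                   ; start of this row inside the string
    mov   edx, r13d                  ; length of this row including the newline
    syscall
    dec   r12                        ; next row starts one star earlier
    inc   r13d                       ; and is one byte longer
    cmp   r13d, ROWS + 1
    jbe   next_row                   ; last row is ROWS stars + newline

    mov   eax, 60                    ; sys_exit
    xor   edi, edi                   ; status 0
    syscall
```

Each row is still produced by a loop iteration, but no memory is modified and nothing has to be pushed or popped around the system call.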

Sharon Minsuk · Answer 1 · 2018-03-18T00:06:16.167

5

Your first row has 10 stars because you are using [trianglerows] in your inner loop. I'm sure you intended to use [trianglesize] (which currently you aren't using anywhere). Then in down_triangle, you'll want to increment, again, [trianglesize] rather than [trianglerows]. Finally, you probably want [trianglesize] to start with 1 rather than 0, for 1 star in the first row.

Also, be sure to correct your memory usage as described by Michael Petch in the comments below, otherwise your variables are being corrupted because they share the same memory.
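
A minimal sketch of the question's two loops with these changes applied. It is not the OP's exact program: it assumes 64-bit Linux with the native `syscall` interface instead of `int 80h`, keeps the counters in `r12`/`r13` (which system calls do not clobber, so no push/pop is needed), and folds `star` and `newline` into one hypothetical helper called `put_char`. It also assumes at least one row.

```nasm
; Assumed build: nasm -f elf64 tri.asm && ld -o tri tri.o
default rel
section .data
char          db ' '
trianglesize  db 1                    ; stars in the current row, starts at 1
trianglerows  db 10                   ; number of rows (assumed >= 1)

section .text
global _start
_start:
    movzx r12d, byte [trianglerows]   ; rows left; byte load, zero-extended
outer_loop:
    movzx r13d, byte [trianglesize]   ; stars left in this row
inner_loop:
    mov   byte [char], '*'
    call  put_char
    dec   r13d
    jnz   inner_loop
    mov   byte [char], 10             ; newline ends the row
    call  put_char
    inc   byte [trianglesize]         ; next row is one star longer
    dec   r12d
    jnz   outer_loop

    mov   eax, 60                     ; sys_exit (x86-64 numbering)
    xor   edi, edi                    ; status 0
    syscall

; Writes the byte stored at char to stdout.
; Clobbers rax, rcx, r11, rdi, rsi, rdx; r12 and r13 survive the syscall.
put_char:
    mov   eax, 1                      ; sys_write
    mov   edi, 1                      ; stdout
    lea   rsi, [char]                 ; buffer
    mov   edx, 1                      ; one byte
    syscall
    ret
```

The two points that matter are the `movzx ..., byte [...]` loads, which read exactly the one byte each variable owns, and the inner loop counting from `trianglesize` rather than `trianglerows`.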

edited Mar 18 '18 at 00:06

answered Mar 17 '18 at 23:31

Sharon Minsuk


He also has to fix the problem with using labels defined with 1 byte of data and reading 8 bytes. This will be very important if he starts using `trianglesize`. – Michael Petch Mar 17 '18 at 23:43
@MichaelPetch Interesting. The last time I used assembly was in 1989, and it was 6502! ;-) But I found this an interesting puzzle and had what seemed like the logical answer, so thought it was worth contributing. The devil's in the details, though! Sounds like you are saying that if he reads from `trianglesize`, he'll be reading data that also contains the "10" from `trianglerows`, and thus he still won't get the right number of stars. (Maybe something like 2561 stars in the first row? In other words hex A01 if `trianglesize` contains 1.) Correct? – Sharon Minsuk Mar 17 '18 at 23:59
Correct, so he either should be making those memory locations quadwords (8 byte values) or modifying his code to read and process the single byte values. – Michael Petch Mar 18 '18 at 00:01
It's still not full fix, as he's also using the wrong system calls (for example in win10 linux box it would crash, while proper 64b linux binary would print triangle correctly), etc... it's not even worth to fix his code properly, as it has so many weak spots, but your question did narrow down the main issue with wrong amount of stars. But the OP needs to learn some assembly first and get some decent lector or book, you can't fix a drowning boat with 50 holes by patching two of them... :/ Not that simple, unfortunately. (as always I sound very negative. OP: it's ok to be where you are, push!) – Ped7g Mar 18 '18 at 00:51
@Ped7g : Well the system call issue is secondary. Since the OP isn't using addresses outside the range that can be represented in 32-bits the code will work if IA32 emulation is in the kernel with `int 0x80`. It is not preferred but it will work within a given set of constraints. Since the OP is getting output one can assume that he's on a 64-bit system that has support for IA32 emulation. So Sharon focusing on the semantic and logical issues will have more value to the OP. OP can focus on that and then deal with cleaning things up. – Michael Petch Mar 18 '18 at 01:00

TigerTV.ru · Answer 2 · 2018-03-25T20:56:32.203

2

I solved the problem this way, it's 32-bit:

bits 32
global _start

section .data
    rows dw 10

section .text
_start:
movzx ebx, word [rows] ; ebx holds number of rows, used in loop

; here we count how many symbols we have
lea eax, [ebx+3]
imul eax,ebx
shr eax,1 ; shr is used to divide by two
; now eax holds number of all symbols
mov edx, eax ; now edx holds number of all symbols, used in print

;we prepare stack to fill data
mov ecx,esp
sub esp,edx

;we fill stack backwards
next_line:
    dec ecx 
    mov [ecx],byte 10
    mov eax,ebx
    next_star:
        dec ecx
        mov [ecx],byte '*'
        dec eax
        jg next_star
    dec ebx
    jg next_line

;print ; edx has number of chars; ecx is pointer on the string
mov eax,4;  ;sys_write
inc ebx;    ;1 - stdout, at the end of the loop we have ebx=0
int 80h;

;exit
mov eax,1       ;1 -  sys_exit
xor ebx,ebx     ;0 - return 0
int 80h;
ret

How did I do it?
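First of all, I count the number of symbols we have to print, because I print them all at once. Row k holds k stars plus one newline, so the total is the sum of a finite arithmetic progression (arithmetic series):

$S_n = \frac{n\,(2a_1 + (n-1)\,d)}{2}$

In our case we have $a_1 = 2$ and $d = 1$, so

$S_n = \frac{n\,(n+3)}{2}$

We see 3 operations: +, * and /. We can optimise only the division by 2, by doing a right shift: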

lea eax, [ebx+3] ; n + 3
imul eax,ebx ; n * (n + 3)
shr eax,1 ; n * (n+3) / 2
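
As a quick sanity check of that count (my own arithmetic, using the question's 10 rows): the triangle has 55 stars plus 10 newlines, and the formula computed by the three instructions gives the same 65 bytes.

```latex
\[
\left.\frac{n(n+3)}{2}\right|_{n=10} = \frac{10 \cdot 13}{2} = 65,
\qquad
\underbrace{\frac{10 \cdot 11}{2}}_{\text{stars}} + \underbrace{10}_{\text{newlines}} = 55 + 10 = 65 .
\]
```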

Our string will live on the stack, so let's reserve enough space for it:

mov ecx,esp
sub esp,edx

And then we fill that stack space with stars and \ns:

next_line:
    dec ecx 
    mov [ecx],byte 10
    mov eax,ebx
    next_star:
        dec ecx
        mov [ecx],byte '*'
        dec eax
        jg next_star
    dec ebx
    jg next_line

I fill it backwards. What does that mean? I fill the string from the end to the beginning. Why do I do that? Because I want to use as few registers as possible. At the end of the loop ecx contains a pointer to the string we want to print. If I filled forwards, ecx would end up pointing at the old esp value (from before the stack was prepared), so I couldn't use it as the string pointer for the print call. I would also need another register for the counter, or an extra cmp, whereas counting down lets dec set the flags for jg directly.

That's all, print and end.
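
For readers on a 64-bit system, here is a hedged sketch of the same backward fill using the native `syscall` interface. As noted in the comments on this answer, a stack buffer lies outside the low 32 bits in a 64-bit process, so `int 80h` cannot be used for it. The `ROWS` constant and the register choices are my assumptions, not part of the original answer, and `ROWS` is assumed to be at least 1.

```nasm
; Assumed build: nasm -f elf64 tri64.asm && ld -o tri64 tri64.o
ROWS equ 10                     ; hypothetical constant, must be >= 1

section .text
global _start
_start:
    mov   ebx, ROWS             ; rbx = stars in the row being written (n down to 1)
    lea   edx, [rbx+3]          ; n + 3
    imul  edx, ebx              ; n * (n + 3)
    shr   edx, 1                ; rdx = n*(n+3)/2 = total bytes = write length

    mov   rsi, rsp              ; rsi = one past the end of the buffer
    sub   rsp, rdx              ; reserve the buffer below the old stack pointer
next_line:
    dec   rsi
    mov   byte [rsi], 10        ; newline that ends this row
    mov   eax, ebx              ; eax = stars left in this row
next_star:
    dec   rsi
    mov   byte [rsi], '*'
    dec   eax
    jnz   next_star
    dec   ebx
    jnz   next_line             ; on exit rsi == rsp, the start of the string

    mov   eax, 1                ; sys_write (x86-64 numbering)
    mov   edi, 1                ; stdout
    syscall                     ; write(1, rsi, rdx)

    mov   eax, 60               ; sys_exit
    xor   edi, edi              ; status 0
    syscall
```

Only the registers that hold pointers (`rsi`, `rsp`) need to be 64 bits wide; the counters can stay 32-bit because writing a 32-bit register zero-extends into the full register.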

Another case

global _start

section .data
    rows dw 10

section .text
_start:

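;it defines how many symbols we have to print
movzx ebx, word [rows] ; ebx holds number of rows (rows is a dw, so load a word)
lea eax,[ebx+3]
imul eax,ebx 
shr eax,1 ; now eax holds number of all symbols
mov edx,eax ; now edx holds number of all symbols, used in print

;prepare the buffer on the stack
sub esp,4   ; slack: the last dword store below can run up to 3 bytes past the string
mov edi,esp ; edi points just past the end of the string
sub esp,eax ; reserve space for the string itself
mov ecx,esp ; ecx points on the beginning of the string, used in print

;fill the string by stars, 4 bytes at a time
mov eax,edx
shr eax,2
mov ebp, dword '****'
next_star:
    mov [ecx+4*eax],ebp
    dec eax
    jge next_star

;fill the string by '\n', from the last row to the first
mov eax,ebx ; eax = number of stars in the current row
next_n:
    dec edi ; step onto the last byte of the row
    mov [edi],byte 0xa
    sub edi,eax ; skip back over the stars of this row
    dec eax
    jg next_n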

;print
;mov ecx,esp
mov eax,4;  ;sys_write
mov ebx,1;  ;1 - stdout 
int 80h;

;exit
mov eax,1       ;1 -  sys_exit
xor ebx,ebx     ;0 - return 0
int 80h;
ret

Here, we first fill the whole string with stars, and only after that do we put the \ns in place.
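
Why the newline pass can simply walk backwards in shrinking steps: counting byte offsets from 0, row k ends at offset k(k+3)/2 - 1, so neighbouring newlines are k+1 bytes apart. A short derivation (my own working, not part of the original answer):

```latex
\[
p_k = \frac{k(k+3)}{2} - 1,
\qquad
p_k - p_{k-1} = \frac{k(k+3) - (k-1)(k+2)}{2} = \frac{2k+2}{2} = k + 1 .
\]
```

For 10 rows the newlines therefore sit at offsets 1, 4, 8, 13, 19, 26, 34, 43, 53 and 64, which is the sequence the loop visits in reverse, starting from the last byte of the string.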

https://github.com/tigertv/stackoverflow-answers

edited Mar 25 '18 at 20:56

answered Mar 18 '18 at 08:54

TigerTV.ru


Your program depends on the upper 2 bytes of `eax` and `ecx` being zero on entry to `_start`. This is the case on Linux for a static executable only, but isn't guaranteed by the ABI, so you should definitely note that with a comment. Or use `movzx ecx, word [trianglerowchars]` like a normal person. Or better, `xor ecx,ecx` to zero ecx outside the outer loop, and don't use any static storage for counters. (Use an `equ` constant for `trianglerows`, and `mov eax, trianglerows`.) – Peter Cordes Mar 18 '18 at 13:17
And use registers that the int 0x80 ABI doesn't use, so you don't have to save/restore. Also please never suggest [the slow `loop` instruction](https://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently), especially when the code in the question doesn't use it. You wouldn't recommend old weird instructions like `xlatb`, and `loop` is (unfortunately) like that: a complex slow instruction that's only useful when optimizing for code-size over speed. – Peter Cordes Mar 18 '18 at 13:19
@PeterCordes: Thanks, I didn't know that `loop` is a slow instruction. Replaced `eax` and `ebx`. I used `xor` and `movzx` – TigerTV.ru Mar 18 '18 at 18:18
Using the `dw 10` number of rows as scratch space for writing one character at a time is weird. You should definitely comment the first instruction that sets each register to say what that register is doing. Other than that, this is a pretty good implementation if you still limit yourself to only `write`ing one byte at a time, instead of generating a line in memory and then writing it, or the whole star. (e.g. on the stack for variable size allocation) – Peter Cordes Mar 19 '18 at 02:03
@PeterCordes: OK. Now the string to print is on the stack. – TigerTV.ru Mar 19 '18 at 14:38
In that case, your comment about porting to 64-bit by only changing eax to rax is totally bogus. The stack is outside the low 32 bits, so [`write` with the 32-bit `int 0x80` ABI will return `-EFAULT`](https://stackoverflow.com/questions/46087730/what-happens-if-you-use-the-32-bit-int-0x80-linux-abi-in-64-bit-code). In any case, you only needed to widen registers that were holding pointers, not `eax`. Also, you're still loading a byte from the `dw` you allocated for `rows:`. So your code will unexpectedly use `rows % 256` when you'd expect it to take the full value up to 65535. – Peter Cordes Mar 19 '18 at 18:41
@PeterCordes: I did it simpler. – TigerTV.ru Mar 19 '18 at 19:13
You forgot to `sub esp, space_for_stars`. Your program will fault if the buffer space you use is larger than the args + environment variables on the stack above your initial `esp`. That's only a few kiB, and could be much smaller if run with a cleared `env`. It's also poor style, because scribbling over stack space above `esp` doesn't work in a function that wants to `ret`. Use `n*(n+1) / 2` or whatever to calc how much space you'll need. Or `(n+1)*(n+2)/2 - 2` because you're printing a newline on every row. – Peter Cordes Mar 19 '18 at 19:33
Yes, i've got fault when I set 500 in rows. I used `n(n+3)/2` for all symbols because `n(n+3)/2 = n+n(n+1)/2`. – TigerTV.ru Mar 19 '18 at 19:44
You should also comment your code to point out that it generates the triangle "backwards", starting with the `\n` in the longest row, because it's starting with a pointer to the end of the buffer it's going to `write()`. That's only obvious to people who already know asm well enough to just read your code without comments. – Peter Cordes Mar 19 '18 at 19:47
@PeterCordes: Added comments and some description. Do I need to use `sysenter` instead of `int 80h` for 32-bit? – TigerTV.ru Mar 20 '18 at 19:27
No, you don't. It's not recommended (or even well supported; the ABI isn't stable) to use `sysenter` manually; if you want to for better performance, the recommended way is to `call` into the VDSO exported by the kernel. https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/. This is only simple with a dynamically linked binary. – Peter Cordes Mar 20 '18 at 19:31
It only takes 3 instructions to do `n * (n+3)/2`, using `lea` / `imul` / `shr`. [See my answer on a triangle-matrix question](https://stackoverflow.com/questions/49165711/algorithm-of-addressing-a-triangle-matrices-memory-using-assembly/49176021#49176021), where I could optimize away the `shr` because I wanted `n*(n+1)/2 * 4` to index dwords. Note that 2-operand `imul` is faster than `mul`; always prefer it unless you want to zero EDX (with operands known to be small) or are optimizing for code-size over performance. – Peter Cordes Mar 21 '18 at 08:44
@PeterCordes: you used `r*(r+1) * 2` for 4 bytes, but here it is one. It can be useful if I fill a register like `****`. Every `mov` can be replaced by `lea`; which of them is faster? – TigerTV.ru Mar 22 '18 at 22:18
Yes, that's why you still need `shr`, like I said in the first sentence, The element size of 4 is why I was able to optimize it away, but you can't. It's still an optimization to use `lea` and `imul`, though, over your original with 4 instructions including `mul`. `lea` replaces your `mov eax,ebx` / `add eax,3` with `lea eax, [ebx+3]`. – Peter Cordes Mar 22 '18 at 22:21