I'm trying to write a program that takes some input, finds the odd positions in the string, and prints the corresponding characters, so if you enter 'somewords' it prints 'oeod'.
I ended up with a loop that iterates through the string, divides the counter by 2, and prints the character at the counter's position if the remainder isn't 0.

Instead of a single character, it prints nothing.

Full code:

SECTION .bss
inp: resb 255

SECTION .data
msg db "Enter the string: ", 0h

SECTION .text
global _start

_start:
    mov    eax, msg
    call   stprint 

    mov    edx, 255  ; take user input 
    mov    ecx, inp 
    mov    ebx, 0 
    mov    eax, 3 
    int    80h 

    call   findodd

    mov    ebx, 0
    mov    eax, 1
    int    80h

findodd:
    push   eax
    push   ecx
    push   edx
    push   esi
    push   ebx

    mov    ecx, 0     ; counter
    mov    esi, 2     ; divider

.iterstring:  
    mov    eax, inp           ; move input to eax
    cmp    byte [eax+ecx], 0  ; check for end of the string in position
    je     .finish            ; if equal, finish
    inc    ecx  

    push   eax
    mov    eax, ecx   ; move counter to eax 
    xor    edx, edx   ; divide it by 2
    idiv   esi  
    pop    eax
    cmp    edx, 0     ; check the remainder
    jnz    .printchar ; print character if != 0
    jmp    .iterstring

.printchar:  
    push   eax
    push   ebx
    movzx  ebx, byte [eax+ecx] ; move single byte to ebx

    push   ecx
    mov    ecx, ebx  ; move ebx to print
    mov    edx, 1    ; print the character
    mov    ebx, 1
    mov    eax, 4
    int    80h

    pop    ecx
    pop    eax
    pop    ebx
    jmp    .iterstring  

.finish:  
    pop    eax  
    pop    ecx   
    pop    edx
    pop    esi
    pop    ebx
    ret  

; print string function (taken from tutorial)
; if I try to print single character with it I get SEGFAULT
stprint:
    push    edx
    push    ecx
    push    ebx
    push    eax
    call    stlen

    mov     edx, eax
    pop     eax

    mov     ecx, eax
    mov     ebx, 1
    mov     eax, 4
    int     80h

    pop     ebx
    pop     ecx
    pop     edx
    ret

stlen:
    push    ebx
    mov     ebx, eax

nextch:
    cmp     byte [eax], 0
    jz      finish
    inc     eax
    jmp     nextch

finish:
    sub     eax, ebx
    pop     ebx
    ret

I tried using bl, al and cl with no luck. I also added some checks, for example printing the counter in .iterstring:

nasm -f elf lr3.asm && ld -m elf_i386 lr3.o -o lr3 && ./lr3
Enter the string: test
1
2
3
4
5

So it seems like iteration works fine.

The closest I've come was with an answer to a similar question (How to print a character in Linux x86 NASM?), after making these changes to the code:

.printchar:
  push   eax
  push   ebx
  push   esi
  mov    eax, inp
  movzx  ebx, byte [eax+ecx]
  mov    esi, ecx ; see below

  push   ecx
  push   ebx
  mov    ecx, esp
  mov    edx, 1    ; print the character
  mov    ebx, 1
  mov    eax, 4
  int    80h

  pop    ecx
  pop    ebx
  pop    eax
  pop    ebx
  mov    ecx, esi  ; without this it just prints 1 character and ends
  pop    esi       ; so ecx is not restored with pop for some reason?
  jmp    .iterstring

But it prints everything except the first two characters:

nasm -f elf lr3.asm && ld -m elf_i386 lr3.o -o lr3 && ./lr3
Enter the string: somewords
mewords        

I'm stuck and can't figure out what my mistakes are.

Edit, final code:

findodd:
    push   eax
    push   ecx
    push   edx
    push   esi
    push   ebx
    mov    esi, 0     ; counter
    mov    eax, inp

.iterstring:
    inc    esi
    cmp    byte [eax+esi], 0
    jz     .finish
    test   esi,1
    jz     .iterstring

    movzx   ecx, byte [eax+esi]
    push    ecx
    mov     ecx, esp
    mov     edx, 1
    mov     ebx, 1
    push    eax
    mov     eax, 4
    int     80h
    pop     eax
    pop     ecx
    jmp     .iterstring

.finish:
    pop    eax
    pop    ecx
    pop    edx
    pop    esi
    pop    ebx
    ret

Now it works as intended:

nasm -f elf lr3.asm && ld -m elf_i386 lr3.o -o lr3 && ./lr3
Enter the string: somewords
oeod

I had to remove more push/pop instructions and also move the counter into esi, because pushing and then popping the registers did not always restore their values, which seems weird to me.
When I moved the address of byte [eax+ecx] into ecx it worked, but when I changed it to byte [eax+1] it would segfault, because restoring eax with pop would break. When I pushed ecx to print a message and then popped it back, it segfaulted, and gdb showed garbage in ecx after the pop.
The current code works fine, though.

Aaron
  • Are you using a 64 bit kernel? – Joshua May 20 '19 at 03:40
  • @Joshua that would only be a problem if it was built without `CONFIG_IA32_EMULATION`, e.g. on WSL (Windows Subsystem for Linux). Those nasm + ld commands will create a 32-bit executable that will run in 32-bit mode. So kernels without IA32 support actually wouldn't run it at all. A 64-bit kernel is not a possible explanation for the symptoms, because the OP provided a good [mcve], not just "doesn't work" :) – Peter Cordes May 20 '19 at 08:16
  • Your `iterstring` loop would be more efficient if you use `jz .iterstring` at the bottom, instead of jumping over a `jmp` with the opposite condition. That's just needlessly overcomplicated vs. an idiomatic `do{}while()` style of loop. Also using `div` to divide by a power of 2 is horrible: use AND to get the remainder, i.e. the low bit(s). Or better, test for odd/even with `test al, 1` to directly check the low bit of EAX. – Peter Cordes May 20 '19 at 08:24
  • Or even more simply, unroll your loop by 2 so instead of branching to alternate between 2 behaviours, you just do them sequentially. First just check for a 0 terminator (even bytes), then for the odd bytes, check and print. Then the loop repeats. You don't need to copy the data anywhere, just have ECX pointing to the array. (Although it would be much more efficient to copy the odd characters to a tmp array and make one `write` system call for the whole output. You could do that efficiently with SSE2 `psrlw xmm0, 8` / `packuswb` to pack the odd bytes from 2 16-byte vectors into one.) – Peter Cordes May 20 '19 at 08:30
  • Thank you, @peter-cordes. I know only basics but will dig down sse2, thanks for pointing out. – Aaron May 20 '19 at 11:07
  • If you're interested in learning SIMD, this looks like a fun use-case for it, but if you're struggling with basics like pointer vs. data then thinking about processing 16 bytes per instruction might be tricky. Or maybe you'll find it makes perfect sense and become a wizard at manually vectorizing tricky problems that compilers suck at. :P If you want a simpler challenge, remove as many push/pop and `mov` instructions as possible, so your code still loops 1 byte at a time but with more compact simple efficient code. See also https://stackoverflow.com/tags/x86/info – Peter Cordes May 20 '19 at 11:30
  • Another fun exercise would be to re-implement this with copying to a tmp buffer and making a single `sys_write` call. One `write` system call is at least a few thousand times more expensive than the rest of what's in your loop, even with that horrible `div`, on modern x86 with mitigation for Spectre and Meltdown enabled. – Peter Cordes May 20 '19 at 11:32
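
The two suggestions in the comments above (handling the bytes in pairs, and issuing a single `write` for the whole output) might be combined roughly as follows. This is only a sketch, assuming 32-bit Linux and the `int 80h` interface used in the question. The `out` buffer and the `.pair`/`.flush` labels are hypothetical names, and `inp` is enlarged to 256 bytes here so that a zero byte always follows the at most 255 bytes that are read.

```nasm
; Sketch: collect the characters at odd indices into a temporary buffer
; and print them with a single sys_write. Assumes 32-bit Linux.
SECTION .bss
inp:    resb 256                ; 255 input bytes + guaranteed zero byte
out:    resb 128                ; hypothetical tmp buffer (at most 127 odd bytes)

SECTION .text
global _start

_start:
    mov     edx, 255            ; read at most 255 bytes from stdin
    mov     ecx, inp
    mov     ebx, 0
    mov     eax, 3              ; sys_read
    int     80h

    mov     esi, inp            ; esi = source pointer (even index)
    mov     edi, out            ; edi = destination pointer

.pair:
    cmp     byte [esi], 0       ; even byte: only check for the terminator
    je      .flush
    mov     al, [esi + 1]       ; odd byte: check, then copy
    test    al, al
    jz      .flush
    mov     [edi], al
    inc     edi
    add     esi, 2              ; advance to the next pair
    jmp     .pair

.flush:
    mov     edx, edi
    sub     edx, out            ; edx = number of bytes collected
    mov     ecx, out            ; ecx = address of the data to write
    mov     ebx, 1              ; fd 1 = stdout
    mov     eax, 4              ; sys_write
    int     80h

    mov     ebx, 0              ; exit status 0
    mov     eax, 1              ; sys_exit
    int     80h
```

It can be built with the same `nasm -f elf` and `ld -m elf_i386` commands as in the question. The loop body contains no system call, so the cost of `write` is paid once rather than once per character.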

1 Answer

This line is incorrect:

mov    ecx, ebx  ; move ebx to print

write (int 80h / eax=4) expects ecx to contain the address of the data to write (see this table). But you're passing the data itself.

In your modified code you're placing the character on the stack and then passing its address in ecx, so that's correct. However, you've already incremented ecx by the time you get to .printchar. That's why your code doesn't print the first character.
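
A minimal sketch of the difference, assuming 32-bit Linux and the same `int 80h` calls as in the question. The hard-coded string is only a stand-in for the buffer filled by `read`. Because the character already lives in memory, its address can be passed directly, without first copying it onto the stack:

```nasm
; Sketch: print the byte at inp[1] by passing its ADDRESS to sys_write.
SECTION .data
inp:    db "somewords", 0       ; stand-in for the input buffer (assumption)

SECTION .text
global _start

_start:
    mov     esi, 1              ; index of the character to print
    lea     ecx, [inp + esi]    ; ecx = address of inp[1], not its value
    mov     edx, 1              ; length: one byte
    mov     ebx, 1              ; fd 1 = stdout
    mov     eax, 4              ; sys_write
    int     80h                 ; prints "o"

    mov     ebx, 0              ; exit status 0
    mov     eax, 1              ; sys_exit
    int     80h
```

With `mov ecx, ebx` instead of the `lea`, the kernel would treat the character code itself as a pointer, and `write` would fail without printing anything.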

As a side note, your check for even/odd numbers is unnecessarily complicated. It could be simplified to just:

test ecx,1      ; set EFLAGS based on ecx AND 1
jnz .printchar
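
Put together, a version of the whole program built around that test might look like the sketch below. It assumes 32-bit Linux, enlarges `inp` to 256 bytes so that a zero byte always follows the input (an assumption, not part of the original code), and uses the hypothetical local labels `.next`, `.skip` and `.done`. The registers are popped in the reverse order of the pushes, because the stack is last-in, first-out; popping in the same order as pushing leaves each value in the wrong register.

```nasm
; Sketch: print the characters at odd indices of inp, one write per character.
SECTION .bss
inp:    resb 256                ; 255 input bytes + guaranteed zero byte

SECTION .text
global _start

_start:
    mov     edx, 255            ; read at most 255 bytes from stdin
    mov     ecx, inp
    mov     ebx, 0
    mov     eax, 3              ; sys_read
    int     80h

    call    findodd

    mov     ebx, 0              ; exit status 0
    mov     eax, 1              ; sys_exit
    int     80h

findodd:
    push    eax                 ; save the registers used below ...
    push    ebx
    push    ecx
    push    edx
    push    esi

    xor     esi, esi            ; esi = index into inp, starting at 0

.next:
    cmp     byte [inp + esi], 0 ; end of string?
    je      .done
    test    esi, 1              ; low bit set means odd index
    jz      .skip               ; even index: nothing to print

    lea     ecx, [inp + esi]    ; address of the character
    mov     edx, 1              ; length: one byte
    mov     ebx, 1              ; fd 1 = stdout
    mov     eax, 4              ; sys_write
    int     80h                 ; only eax is overwritten (return value)

.skip:
    inc     esi
    jmp     .next

.done:
    pop     esi                 ; ... and restore them in reverse order
    pop     edx
    pop     ecx
    pop     ebx
    pop     eax
    ret
```

Since `int 80h` preserves every register except eax, the index in esi survives the system call and no push/pop is needed inside the loop.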
Michael