1

I've recently started looking into assembly code and I'm trying to recode some basic system functions to get a grip on it, I'm currently stuck on a segmentation fault at 0x0 on my strchr.

section .text
global strchr

strchr:
    xor rax, rax

loop:
    cmp BYTE [rdi + rax], 0
    jz end

    cmp sil, 0
    jz end

    cmp BYTE [rdi + rax], sil
    jz good

    inc rax
    jmp loop

good:
    mov rax, [rdi + rcx]
    ret

end:
    mov rax, 0
    ret

I can't figure out how to debug it using GDB, also the documentation I've came across is pretty limited or hard to understand.

I'm using the following main in C to test

extern char *strchr(const char *s, int c);

int main () {
   const char str[] = "random.string";
   const char ch = '.';
   char *ret;

   ret = strchr(str, ch);
   printf("%s\n", ret);
   printf("String after |%c| is - |%s|\n", ch, ret);

   return(0);
}
JFMR
  • 23,265
  • 4
  • 52
  • 76
C.Edelweiss
  • 151
  • 13
  • It seems as though you've got the character to look for in `sil` and you are using `rax` as the offset counter... but when you find the character in the string, `rcx` comes out of nowhere... – David Hoelzer Mar 18 '18 at 11:13

1 Answers1

5

The Problem

The instruction immediately following the good label:

mov rax, [rdi + rcx]

should actually be:

lea rax, [rdi + rax]

You weren't using rcx at all, but rax and, what you need is the address of that position, not the value at that position (i.e. lea instead of mov).


Some Advice

  1. Note that the typical idiom for comparing sil against zero is actually test sil, sil instead of cmp sil, 0. It would be then:

    test sil, sil
    jz end
    

    However, if we look at the strchr(3) man page, we can find the following:

    char *strchr(const char *s, int c);

    The terminating null byte is considered part of the string, so that if c is specified as '\0', these functions return a pointer to the terminator.

    So, if we want this strchr() implementation to behave as described in the man page, the following code must be removed:

    cmp sil, 0
    jz end 
    
  2. The typical zeroing idiom for the rax register is neither mov rax, 0 nor xor rax, rax, but rather xor eax, eax, since it doesn't have the encode the immediate zero and saves one byte respect to the latter.


With the correction and the advice above, the code would look like the following:

section .text
global strchr

strchr:
    xor eax, eax

loop:
    ; Is end of string?
    cmp BYTE [rdi + rax], 0
    jz end

    ; Is matched? 
    cmp BYTE [rdi + rax], sil
    jz good

    inc rax
    jmp loop

good:
    lea rax, [rdi + rax]
    ret

end:
    xor eax, eax
    ret
JFMR
  • 23,265
  • 4
  • 52
  • 76
  • @考えネロク Thanks a lot for the explanation! I've tried correcting it and it still gets a core dumped at 0x1 somehow. // Access not within mapped region. – C.Edelweiss Mar 18 '18 at 11:39
  • @C.Edelweiss Which arguments are you passing to `strchr` ? Are you checking the return value of the function before using it (i.e.: it can be `NULL`) ? – JFMR Mar 18 '18 at 11:44
  • @考えネロク I'm passing it a const char str[] = "random.string" with a const char c = '.' . The test main crashes before I can check any return. // I've edited the main post with my main.c that i'm using for testing – C.Edelweiss Mar 18 '18 at 11:48
  • 1
    The code of your last edit works for me, both for your assembly with just the `lea rax, [rdi + rax]` correction and for my assembly above. The problem seems to be somewhere else... – JFMR Mar 18 '18 at 11:54
  • 1
    my output is `.string String after |.| is - |.string| ` – JFMR Mar 18 '18 at 11:55
  • @考えネロク I'll look around for the issue, thanks a lot! – C.Edelweiss Mar 18 '18 at 11:59
  • 2
    I think `test sil, sil` / `jz end` should be removed entirely, not just hoisted out of the loop. `strchr(string, 0)` should work like strlen, rather than returning not-found. [The `strchr(3)` man page](http://man7.org/linux/man-pages/man3/strchr.3.html) even mentions this case: `if c is specified as '\0', these functions return a pointer to the terminator.` – Peter Cordes Mar 19 '18 at 20:53
  • 1
    Also, it would obviously be better to load the character into a register (with `movzx eax, byte [rdi]`) before testing it twice, because not all x86 CPUs have a load throughput of 2 loads per clock. (Of course, with the @C.Edelweiss [loop structure, with an extra `jmp` at the bottom instead of a `jnz`](https://stackoverflow.com/questions/47783926/why-are-loops-always-compiled-like-this), it bottlenecks at one iteration per 1.5 clocks even on Haswell.) I'd just increment `rdi` and load from `[rdi]`. You want to return a pointer, not a count, so you just need `mov rax, rdi` at the end. – Peter Cordes Mar 19 '18 at 21:01
  • 1
    @PeterCordes How right you are! The implementation was wrong (as clearly stated in the man page). Thank you!!! – JFMR Mar 19 '18 at 21:13