-1

So i made this code with the knowledge i gather in various sites, i think there is an optimazed way to do it without pushing and poping the registers on the stack memory, but i don't know how to do it. Here is my Code

comparing proc
    MOV CX, SIZEOF vec2 ;The size of the substring
    DEC CX
    MOV DX, SIZEOF vec1 ; The size of the String
    LEA SI, vec1        ; The String
    LEA DI, vec2        ; The substring

FIND_FIRST:        
    MOV AL, [SI];   storing the ascii value of current character of mainstring 
    MOV AH, [DI];   storing the ascii value of current character of substring
    CMP AL,AH;      comparing both character
    JE FITTING;      if we find it we try to find the whole substring
    JNE NEXT

NEXT:
    INC SI; We go to the next char
    DEC DX; And the size of the string decreased
    JE N_FINDED
    JMP FIND_FIRST
FITTING:
    CLD
    PUSH CX ; I push this register because in the instruction REPE CMPSB
    PUSH SI ; They change.
    PUSH DI
    REPE CMPSB
    JNE N_FITTING
    JE FINDED
N_FITTING:
    POP DI
    POP SI
    POP CX
    JMP NEXT ; if the complete substring doesn't fit we go to the next char
FINDED:
    POP DI
    POP SI
    POP CX
    MOV AL, 0001H;  substring found
    JMP RETURN


N_FINDED:    
    MOV AL, 0000H;  substring not found

RETURN:
    ret 
comparing endp
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • For what it's worth, the supine of "find" is "found". – tripleee Oct 04 '21 at 18:08
  • 1
    In terms of micro-optimizations, you can hoist `cld` and `push`/`pop` out of the loop, only saving/restoring around the whole function. Also, scanning for a matching first byte before starting a slow `rep cmpsb` would make a lot of sense. (In fact, on modern CPUs, `rep cmps` and `rep scas` aren't optimized with fast-strings microcode, only `rep stos` and `rep movs`, so generally avoid them entirely; [see this example](https://stackoverflow.com/q/55563598) of an old version of GCC that inlines `rep cmpsb` at `-O1`, being slow for large strlen especially vs. SSE2 SIMD. – Peter Cordes Oct 04 '21 at 18:18
  • 3
    In terms of algorithmic optimizations, for a long substring "needle", there are algorithms that can do better than just looking at every byte of the "haystack" as a possible start point. A famous one is Boyer-Moore. https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm. Or without really changing the brute-force strategy, to use `cmp [si], ax` or something to check for 2 matching bytes, with overlapping unaligned word compares. Are you optimizing for actual 8086, or for modern x86 in 16-bit mode? Or somewhere in between like 386 or Pentium? – Peter Cordes Oct 04 '21 at 18:19

1 Answers1

1

If the substring happens to have more than one character, which is very likely, then your code will start comparing bytes that are beyond the string to search through.
With the string referred to by DI and its length DX, and the substring referred to by SI and its length CX, you first need to make sure that neither string is empty, and then you need to limit the number of possible finds. Next 4 lines of code do just that:

    jcxz NotFound   ; Substring is empty
    sub  dx, cx
    jb   NotFound   ; Substring is longer than String
                    ; Happens also when String is empty
    inc  dx

As an example consider the string "overflow" (DX=8) and the substring "basic" (CX=5):

sub  dx, cx    ; 8 - 5 = 3
inc  dx        ; 3 + 1 = 4  Number of possible finds is 4

overflow

basic       possible find number 1
 basic      possible find number 2
  basic     possible find number 3
   basic    possible find number 4

You can write your proc without having to preserve those registers on the stack (or elsewhere) all the time. Just introduce another register so you don't have to clobber the CX, SI, and DI registers:

    jcxz NotFound
    sub  dx, cx
    jb   NotFound
    inc  dx

    mov  al, [si]       ; Permanent load of first char of the Substring
FirstCharLoop:
    cmp  [di], al
    je   FirstCharMatch
NextFirstChar:
    inc  di
    dec  dx             ; More tries?
    jnz  FirstCharLoop  ; Yes
NotFound:
    xor  ax, ax
    ret

FirstCharMatch:
    mov  bx, cx
    dec  bx
    jz   Found          ; Substring had only 1 character
OtherCharsLoop:
    mov  ah, [si+bx]
    cmp  [di+bx], ah
    jne  NextFirstChar
    dec  bx
    jnz  OtherCharsLoop
Found:
    mov  ax, 1
    ret

Do note that this code now does not compare the first character again like the repe cmpsb in your program did.
AX being the result, the only registers that are clobbered (and that you maybe could want to preserve) are BX, DX, and DI.

Sep Roland
  • 33,889
  • 7
  • 43
  • 76