
The aim of the program is to sum the 12 bytes stored at ffff:0 through ffff:b.
The result of the addition is left in dx.

assume cs:code
code segment
mov ax,0ffffH   ; segment of the data
mov ds,ax       ; ds = ffffH

mov ax,4000H    ; segment of the stack
mov ss,ax       ; ss = 4000H
mov bx,0000H    ; bx = 0, offset of the first byte
mov sp,0010H    ; ss:sp = 4000:0010

push bx         ; push 0 as the initial sum
mov cx,000cH    ; loop count: 12 bytes

s:              ; loop start
pop ax          ; fetch the running sum
mov dl,[bx]     ; dl = byte at ffff:bx (ffff:0 to ffff:b)
sub dh,dh       ; dh = 00H, so dx = the byte
inc bx          ; bx = bx + 1
add dx,ax       ; dx = byte + running sum
push dx         ; save the new running sum
loop s          ; decrement cx, jump to s while cx is not 0

mov ax,004cH    ; used to return to DOS
int 21H         ; via int 21H

code ends
end
Peter Cordes
Chris
  • [edit] your question to fix the formatting. Don't post more code as a non-answer. (Select your code and use the "code formatting" option in the editor menu, or use triple backticks.) – Peter Cordes Jul 02 '21 at 01:24
  • Optimize for what, code size? Performance on modern CPUs, like Skylake and Zen 2 in real mode? The `loop` instruction [is slow on Intel, fast on AMD](https://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently). And you'll want to avoid partial-register false dependencies, and hoist zeroing out of the loop. Also certainly don't `push` or `pop` in the loop, that's not useful unless you're trying to create a prefix-sum array on the stack. – Peter Cordes Jul 02 '21 at 01:28
  • Can you assume that SSE2 or MMX SIMD is available? If so, you can sum 16 unsigned bytes at a time with `psadbw` against a zeroed XMM register. So maybe load 16 bytes and `pslldq` byte-shift to keep the 0..b part you want. (Assuming true real mode, thus 64k segment limits, it's definitely safe to load all the way out to `ffff:000f`.) – Peter Cordes Jul 02 '21 at 01:30
  • Or did you want to optimize for performance on 8088? (In this case probably the same as code size, since pop/push inside the loop isn't doing anything useful so we can just remove that memory access.) – Peter Cordes Jul 02 '21 at 01:48
  • You asked about optimizing the "algorithm". I think the only possible optimizations are micro-optimizations, not algorithmic, unless you know something about the data. Your current algorithm (a loop) reads each element once, and does the minimum number of additions. Getting a speedup is going to involve doing the additions faster, not doing fewer additions or some other operation, unless you know that every value is the same for example, then you just load one and multiply. – Peter Cordes Jul 02 '21 at 02:00
  • Why are you setting up a stack (which you have done wrongly anyway; if an interrupt comes at the wrong place it's bad news time)? DOS provides your application with a perfectly fine stack on program start. No need to butcher that. – fuz Jul 02 '21 at 11:19

1 Answer

Excessive memory use: There's no need for the detour over the stack. This also slows down the code.

Redundant register use: Currently you're controlling the loop through a separate loop-control variable (CX), but you can just as well control the loop with the address variable (BX) that is already available.

Reducing the number of iterations: Instead of doing 12 additions to a DX register that was cleared beforehand, you can load DX with one of the values and then do only 11 additions.
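As an intermediate step, the first point can be applied on its own. The following is a minimal sketch, not a definitive version: it keeps the original CX/`loop` structure and the label `s`, assumes DS has already been set to FFFFh as in the question, and keeps the running sum in DX instead of on the stack. Clearing AH once before the loop is a choice made for this sketch, so that AX always holds the zero-extended byte.

```asm
; Sketch only: assumes DS already holds 0FFFFh, as in the original program.
  xor dx, dx      ; DX = running sum, cleared once before the loop
  xor ax, ax      ; AH stays 0, so AX = zero-extended byte after each load
  xor bx, bx      ; BX = offset 0000h
  mov cx, 000Ch   ; 12 bytes: FFFF:0000 through FFFF:000B
s:
  mov al, [bx]    ; load the next byte into AL
  add dx, ax      ; add it to the 16-bit sum
  inc bx          ; advance to the next byte
  loop s          ; decrement CX, repeat while CX is not 0
```

The sum of 12 bytes is at most 12 * 255 = 3060, so it always fits in DX. The loop still makes one memory read per byte, but the `push`/`pop` pair per iteration is gone and no private stack has to be set up.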

Below is one possible implementation.

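Traversing the memory from top to bottom means we don't have to `cmp` the address with some value. We can use the flags obtained from the `dec` of the address: `dec bx` sets the sign flag once BX wraps from 0000h to FFFFh, so `jns` ends the loop right after the byte at offset 0 has been added.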

  mov bx, 000Ah
  mov dx, 0FFFFh
  mov ds, dx
  inc dx          ; Cheap way to zero DX in this case
  mov dl, [bx+1]  ; Load byte from FFFF:000B
More:
  add dl, [bx]
  adc dh, bh      ; BH=0 throughout the address range
  dec bx
  jns More

  mov ax, 4C00h   ; DOS.TerminateWithReturnCode
  int 21h

Please note that your program was able to exit to DOS purely by chance with your instructions `mov ax,004cH` `int 21H`. DOS takes the function number from AH, and `mov ax,004cH` puts 00h in AH and 4Ch in AL. From the mention of 4cH it is clear that the intention was to exit via the DOS.TerminateWithReturnCode function 4Ch, but through a lucky inversion you got the old and deprecated DOS.TerminateProgram function 00h.

Sep Roland