How do I efficiently do signed comparisons on the 8080?

Question

I want to compare two 16-bit numbers and branch on the result: the equivalent of if (a<b) goto negative. I'm using an Intel 8080.

The Z80 has a signed arithmetic overflow flag which can be used for this with some degree of effort. The standard code is:

ld de, _left
ld hl, _right
ld a, e
sub a, l
ld a, d
sbc a, h
jp po, $+5  ; branch on overflow flag not set
xor a, 0x80 ; flip sign bit
jm negative ; actually do the test

But the 8080 isn't a strict subset of the Z80, and the code above won't work there --- on the 8080, arithmetic instructions set the P flag based on the parity of the result, with hilarious results.

So what's the idiomatic way to do signed comparisons on the 8080?

Actually calculating the overflow flag is possible but really painful as it requires doing bit operations on both the operands and the result, and I'm running out of registers. Also, it's not actually what I want here; I don't actually care about the overflow. I just want to do the comparison.

(I can't simply subtract and branch if the result is negative because that doesn't work in all cases. Consider INT_MIN < INT_MAX. That's 0x8000 - 0x7fff = 1, which is obviously positive.)

If all else fails, you can check if the operands have the same sign. If not, you already know which is greater. If yes, then it's safe to use a subtraction. As far as I can tell from a quick glance, that's what `sdcc` generates too. — Jester, Feb 11 '19 at 22:09
how about using unsigned values instead? no chance to offset everything in other code by like +0x8000 and operate on 0..ffff range? — Ped7g, Feb 11 '19 at 22:23
I'm a compiler backend. I don't get to decide whether int is signed or not! — David Given, Feb 13 '19 at 21:08
@DavidGiven that's technically incorrect. You are creating implementation for some kind of abstract machine (like C abstract machine), so as long as implementation acts "as if" signed integer is used, it's correct, even if the actual implementation is with unsigned native type. Now putting this into practice would be probably on similar level of what modern clang/gcc compilers do with autovectorizing/etc (like replacing naive "int" in loops with packed ints parallelization). I.e. clearly not even marginally worth the effort for obscure 8b platform... so you are practically correct. :) — Ped7g, Feb 13 '19 at 21:28

phuclv · Answer 1 · 2022-06-11T10:30:54.823

3

On the 8085 there are two undocumented flags K and V for signed comparison and signed overflow, along with undocumented instructions JK/JNK for jumping on signed less than/larger than

Not sure whether they're available on 8080 or not. But if they don't you can convert a signed comparison to an unsigned one and vice versa by toggling the top bit

bool signedCmp(int a, int b)
{
    return unsignedCmp(a ^ INT_MIN, b ^ INT_MIN);
}

I don't know 8080 assembly but maybe something like this can be used to compare DE and HL

mov a, e
sub a, l     ; e - l
mov a, h
xri a, 0x80
mov h, a     ; h ^= 0x80
mov a, d
xri a, 0x80  ; a = d ^ 0x80
sbb a, h     ; d - h
jc lessthan  ; branch on carry
    ; >= case
:lessthan
    ; < case

edited Jun 11 '22 at 10:30

answered Feb 12 '19 at 04:58

phuclv

37,963
15
156
475

nitpick: The inverse of less-than is greater-or-equal, not just "larger". – Peter Cordes Feb 12 '19 at 05:02
Sadly, these flags are not available on the 8080. But toggling the top bit looks pretty easy and I'll give it a try (although I am slightly running out of registers). – David Given Feb 12 '19 at 10:16
1

This does, in fact, work --- but it is an extra eight bytes on each comparison [sad face]: `mov a, d; xri 0x80; mov d, a; mov a, h; xro 0x80; mov h, a`. When comparing against a constant I can roll the xor into the immediate value but because `xri` resets the carry flag I can't do it inline the way you suggest above (it causes the `sbb` to go wrong). – David Given Feb 12 '19 at 10:57

Ped7g · Accepted Answer · 2019-02-14T09:14:43.230

Considering the accepted answer and its comments, I would reconsider Jester's advice (which seems to me only +4B when compared to proposed Z80 code, but with somewhat different code layout, i.e. where less/greater_equal branches reside, which may further complicate or simplify your code ... plus it should perform better than doing xor 0x80 every time to both D and H):

    mov     a,d
    xra     h
    jp      sameSigns   ; as "JNS" in 8086 / "jp p," in Z80
    ; sign bits are different, signed overflow may happen
    ; but if H positive, then DE is less than HL
    xra     d           ; make A=H and set sign flag
    jm      DeIsGreaterEqualThanHl
:DeIsLessThanHl
    ; DE < HL
    ...

:sameSigns
    ; sign bits are equal, it is safe to do ordinary sub
    mov     a,e
    sub     l
    mov     a,d
    sbb     h
    jc      DeIsLessThanHl
:DeIsGreaterEqualThanHl
    ; DE >= HL
    ...

You can also modify it as procedure which returns CF=1 when DE<HL by doing:

:SignedCmpDeHl
    mov     a,d
    xra     h
    jp      sameSigns   ; as "JNS" in 8086 / "jp p," in Z80
    ; sign bits are different, signed overflow may happen
    ; but if H positive, then DE is less than HL
    xra     d           ; make A=H and set sign flag (CF=0)
    rm                  ; return CF=0 when DE >= HL (H is negative)
    stc
    ret                 ; return CF=1 when DE < HL (H is positive/zero)
:sameSigns
    ; sign bits are equal, it is safe to do ordinary sub
    mov     a,e
    sub     l
    mov     a,d
    sbb     h
    ret                 ; return with CF=1 when DE < HL (CF=0 DE >= HL)

BTW you can turn CF=0/1 into A=0/~0 by sbb a - sometimes that 0/255 is handy for further calculations...

But as I commented under question, many time this is worth of revisit on architectural level, to see if the whole code logic can't be turned into unsigned 0..FFFF mode of operation, maybe it would lead to adjusting (by -32768) values like "_left" just at one/two specific spots (like final output to user), while many more other internal comparisons/usages would work in unsigned way.

EDIT:

Some variants for compares against constants (for one-time constant it's probably better (size-wise) to just load it into other RP and use the generic RP1 vs RP2 compare, especially if you have spare RP and the generic compare is already instantiated for other code ... but for multiple uses of the same constant this will probably win both size-wise and speed-wise ... inlining wins speed-wise? May get on par with subroutine, depends how the result is used).

reg-pair (actually also any 8b reg) against zero:

; signed compare 8b or 16b register vs 0, into SF, destroys A
    xra     a       ; A=0
    ora     R       ; 16b R=[HDB], or any 8b R: SF = (RP < 0 or R < 0)
    ...i.e. "jm hlIsLessThanZero"

; signed compare 8b or 16b register vs 0, into CF, destroys A
    mov     a,R     ; 16b R=[HDB], or any 8b R
    ral             ; CF = (RP < 0) or (R < 0)
    ...i.e. "jc hlIsLessThanZero" or "sbb a" to get 0/255

reg-pair against any 16b #XY constant:

; signed 16b compare RP (HL/DE/BC) vs nonzero constant #XY
; subroutine, returns CF=1 if RP < #XY, modifies A
    mov     a,R
    xri     0x80    ; convert 8000..7FFF into 0000..FFFF
    cpi     #X^0x80 ; "X" is xor-ed with 0x80 too to have it in 0000..FFFF range
    rnz             ; if ZF=0, then CF=1 is (RP < XY) and CF=0 is (RP > XY)
    ; R == X, the low 8b P vs Y will decide
    mov     a,P
    cpi     #Y      ; CF=1 if (RP < XY)
    ret             ; 10B for particular #XY constant and RP

; inlined form
    mov     a,R
    xri     0x80    ; convert 8000..7FFF into 0000..FFFF
    cpi     #X^0x80 ; "X" is xor-ed with 0x80 too to have it in 0000..FFFF range
    jnz     HiByteWasDecisive   ; if ZF=0, then CF is set correctly, done
    mov     a,P     ; R == #X, the low 8b P vs #Y will decide final CF
    cpi     #Y      ; CF=1 if (RP < #XY)
:HiByteWasDecisive
    ; CF=1 is (RP < #XY) and CF=0 is (RP >= #XY)
    ...

@PeterCordes `jp` on 8080 is `jns` on 8086 as if "jump positive/plus", then there is "jm" as if "jump minus" .. the parity jumps are `jpo/jpe`: http://www.classiccmp.org/dunfield/r/8080.txt But I wouldn't be shocked if there is 8080 assembler having some specific variant of these mnemonics, I'm not 8080 guy, but in the world of Z80 there are even clashing ways of naming some (unofficial) instructions, like `sll` used by some assemblers as alias for `sla` (shift left arithmetic = shift left logical), and some assemblers will assemble `sll` as unofficial `sl1/sli/sls/...` = shift left + b0=1. — Ped7g, Feb 13 '19 at 14:01
I mean, maybe that TXT I used is using unofficial mnemonics and `jns` it is officially even for 8080, that's why I put there comment explaining which one I had on mind, it's supposed to jump when SF=0 (and `rm` is supposed to `ret` when SF=1). — Ped7g, Feb 13 '19 at 14:04
Ah, thanks, I misinterpreted what they were saying. Those are some confusing mnemonics, especially once Z80 redefined the parity flag as signed-overflow so people are using it along with the sign flag for comparisons. — Peter Cordes, Feb 13 '19 at 14:06
@PeterCordes the Z80 actually does use that flag also as parity, and also for some third purpose, which I already forgot, it's signed-overflow only for specific instructions... :D ... lot of fun ;) (toying around with ZX Spectrum Next right now, so I run into all those Z80 nuances recently :) ) — Ped7g, Feb 13 '19 at 14:10
So what I currently have is 15 bytes for compare-with-register and 13 for compare-with-constant; farming comparisons out to a helper routine would be 6 bytes for compare-with-register (3 for the routine, 3 for the test) and 9 bytes for compare-with-constant (another three to load the value). But there'd also be a penalty for having to coerce values into specific registers which I currently don't have. I mean, it's _better_, and I'll do it, but I was hoping for more better... — David Given, Feb 13 '19 at 23:23
@DavidGiven that routine doesn't look to be constrained to any particular register except accumulator, i.e. any other reg pair combination should be possible just by mutating reg-bits in code? (too lazy to verify, 8080 is not that interesting to me). BUT about *constant* there should be much better way, will try to verify + edit answer if there is. But why would the constant require full calculation, that feels wrong. — Ped7g, Feb 13 '19 at 23:52
Using helper routines doing hl<>de ends up knocking 1.4% off the size of my binary --- thanks very much! But my point was that if the values aren't in hl and de, my compiler needs to marshal them into the right place, which can be expensive. I wonder whether I should do a set of dl<>he helper routines too. — David Given, Feb 14 '19 at 22:59
@DavidGiven I see. My point was, that if your app does compare hl — Ped7g, Feb 15 '19 at 09:10
@DavidGiven one more thing.. that routine to compare signed DE — Ped7g, Feb 15 '19 at 10:39

How do I efficiently do signed comparisons on the 8080?

2 Answers2

Linked