Hello, I'm currently working on a project in assembly language and came up with this code:

mov ah, 07h 
int 21h 
mov bl,al     

cmp bl, 'w'
je  up
cmp bl, 'W'
je  up 

The code reads a letter from the keyboard and jumps to another routine. I want the comparison to match whether the letter is uppercase or lowercase. Is there a way to do that?

Peter Cordes
  • `'w'` and `'W'` are just numbers. `'w'` == `'W' + 32`, look at an ASCII table for more info. – paladin May 10 '22 at 07:00
  • Yes, but I also want to reduce the number of lines. Is there a way to compare against 'w' or 'W' at once, something like `cmp bl, 'w' or 'W'`? – Jansen Lloyd Macabangun May 10 '22 at 07:03
  • The 8086 has no opcode that does 2 compares at once. Your code cannot be shortened any further. – paladin May 10 '22 at 07:15
  • @paladin: How does `'W' + 32` help? Unlike `'W' | 32`, that's not idempotent so it doesn't fold both inputs to one value to check for. (As in Sep's answer). Or I guess you weren't trying to propose an optimization, just saying it's *not* possible. (Which is correct in general for two arbitrary values, and the optimization here takes an extra instruction before the `cmp`) – Peter Cordes May 10 '22 at 20:31

1 Answer

Because the uppercase letters [A,Z] (ASCII codes [65,90]) and the corresponding lowercase letters [a,z] (ASCII codes [97,122]) differ by exactly 32, and because of how the ASCII table is organised, every lowercase letter has bit 5 set while no uppercase letter does.
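
As a small illustration of that bit pattern, here is what setting bit 5 does to `'W'`. This is only a sketch; the choice of AL is arbitrary, and the binary values are the standard ASCII codes.

```asm
mov al, 'W'     ; 'W' = 57h = 0101 0111b  (bit 5 clear)
or  al, 20h     ; 20h = 32  = 0010 0000b  (bit 5 only)
                ; AL  = 77h = 0111 0111b  = 'w'
```

Applying the same `or` to `'w'` (77h) leaves it unchanged, because bit 5 is already set.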

Make it case-insensitive

Before comparing, you can OR the character code with 32 (20h, the value of bit 5); after that you need just one comparison.

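mov ah, 07h    ; DOS function 07h: read one character without echo
int 21h        ; The character is returned in AL
mov bl, al     ; BL keeps the character exactly as typed

or  al, 32     ; Make case-insensitive
cmp al, 'w'    ; Only comparing lowercase, but accepting both cases
je  up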

What the `or al, 32` instruction does:

  • if AL is [A,Z] it becomes [a,z]
  • if AL is [a,z] it remains [a,z]
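
If several keys have to be checked, one `or` can serve all the comparisons. The fragment below is only a sketch of that idea: `up` comes from the question, while `down`, `left` and `right` are hypothetical labels assumed to exist elsewhere in the program.

```asm
mov ah, 07h    ; DOS function 07h: read one character without echo
int 21h        ; The character is returned in AL
mov bl, al     ; BL keeps the character exactly as typed

or  al, 32     ; Fold [A,Z] onto [a,z] once, for all tests below
cmp al, 'w'
je  up
cmp al, 's'
je  down       ; hypothetical label
cmp al, 'a'
je  left       ; hypothetical label
cmp al, 'd'
je  right      ; hypothetical label
```

Each key now costs one `cmp`/`je` pair instead of two, at the price of a single extra `or`.
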
Sep Roland
  • Also equally importantly, no non-letter ASCII code can become `w` either; `'W'` and `'w'` are the *only* inputs such that `(c | 0x20) == 'w'`. Setting or clearing the lower-case bit doesn't make non-letters into letters or vice-versa, so in general this is a useful trick for things like checking if a character is a letter of the alphabet. ([What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?](https://stackoverflow.com/a/54585515) has more details) – Peter Cordes May 10 '22 at 20:27
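
The letter check mentioned in the last comment could be sketched as follows. This is an illustration rather than code from the thread: the use of DL as scratch register and the label `not_letter` are assumptions.

```asm
mov dl, al      ; Work on a copy so AL keeps the original character
or  dl, 20h     ; Fold [A,Z] onto [a,z]
sub dl, 'a'     ; [a,z] becomes [0,25]; any other code lands above 25
cmp dl, 25
ja  not_letter  ; Unsigned compare: taken for every non-letter (hypothetical label)
                ; Falls through here when AL holds a letter in either case
```

The unsigned `ja` matters here: codes below `'a'` wrap around to large values after the `sub`, so a single compare rejects everything outside the alphabet.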