Correct, there's a gap of six non-alphabetic characters ([ \ ] ^ _ `, codes 0x5B to 0x60) between 'Z' and 'a'.
The most efficient way is to set the lower-case bit with an OR, then use the range-check trick of sub + unsigned compare. This of course only works for ASCII, not extended character sets where there are other ranges of alphabetic characters. Note that or al, 0x20 can never create a lower-case letter unless the original was already a letter (upper or lower case), because the two ranges are "aligned" the same way relative to a mod 32 boundary of ASCII codes: 'A' is 0x41 and 'a' is 0x61.
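As a sanity check on the claim about or al, 0x20, here is a minimal C sketch that compares the "set bit 5, then test the lower-case range" approach against an explicit two-range test for every byte value. The function names are made up for this illustration and are not from the answer's asm.

```c
#include <stdbool.h>

/* Explicit two-range test, used here only as the reference. */
static bool is_letter_reference(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Same idea as  or al, 0x20 : set bit 5, then test only the lower-case range. */
static bool is_letter_folded(unsigned char c)
{
    unsigned char folded = (unsigned char)(c | 0x20);
    return folded >= 'a' && folded <= 'z';
}

/* Count the byte values (0..255) on which the two tests disagree. */
static int count_fold_mismatches(void)
{
    int mismatches = 0;
    for (int c = 0; c < 256; c++) {
        if (is_letter_folded((unsigned char)c) != is_letter_reference((unsigned char)c))
            mismatches++;
    }
    return mismatches;
}
```

count_fold_mismatches() returning 0 means that setting bit 5 never turns a non-letter into something in the 'a'..'z' range.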
Arrange your loop structure with the conditional branch at the bottom. Either enter the loop with a jmp to the load and test at the bottom, or peel that part of the first iteration. (See: Why are loops always compiled into "do...while" style (tail jump)?)
Use movzx loads to avoid a false dependency on merging a low byte into EAX when writing AL.
    ; ESI = pointer to the string (NASM syntax)
    xor    ecx, ecx              ; index = 0
    movzx  eax, byte [esi]       ; test first character
    test   eax, eax
    jz     .done                 ; skip the loop on empty string
    ; alternative: jmp .next_char to enter the loop
.loop:                           ; do{
    inc    ecx
    mov    edx, eax              ; save a copy of the original if needed
    ;;;; THESE 4 INSTRUCTIONS ARE THE ALPHA / NON-ALPHA TEST
    or     al, 0x20              ; force lowercase
    sub    al, 'a'               ; AL = 0..25 if alphabetic
    cmp    al, 'z'-'a'
    ja     .non_alphabetic       ; unsigned compare rejects too high or too low (wrapping)
    ;; do something if it's a letter
    jmp    .next_char
.non_alphabetic:
    ;; do something different, then fall through
.next_char:
    movzx  eax, byte [esi + ecx]
    test   eax, eax
    jnz    .loop                 ; }while((AL = str[i]) != 0);
.done:
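For readers who find the control flow easier to follow in C, the loop above can be sketched roughly as follows. This is only an illustration of the same shape (peeled first test, conditional branch at the bottom). The name count_letters and the choice of "count the letters" as the work done per character are assumptions, not part of the asm.

```c
#include <stddef.h>

/* Hypothetical helper: counts ASCII letters in a NUL-terminated string,
   mirroring the loop shape of the asm. */
static size_t count_letters(const char *str)
{
    size_t i = 0;                                  /* ECX: index */
    size_t letters = 0;
    unsigned char c = (unsigned char)str[0];       /* peeled first load and test */

    if (c == 0)
        return 0;                                  /* jz .done */

    do {
        i++;                                       /* inc ecx */
        /* or al, 0x20 / sub al, 'a', truncated to 8 bits like AL */
        unsigned char t = (unsigned char)((c | 0x20) - 'a');
        if (t <= 'z' - 'a')                        /* cmp / ja, with the sense inverted */
            letters++;
        c = (unsigned char)str[i];                 /* .next_char: load the next byte */
    } while (c != 0);                              /* test / jnz .loop */

    return letters;
}
```

For example, count_letters("Hello, World!") counts the ten letters and skips the comma, space and exclamation mark.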
If the input is before 'a', sub al, 'a' will be signed negative, or as unsigned will wrap to a high value, so cmp al, 'z'-'a' / ja will reject it.
If the input is after 'z', sub al, 'a' will leave a value higher than 25 ('z'-'a'), so the unsigned compare will reject it also.
Compilers use this unsigned compare trick when compiling a C expression like c <= 'z' && c >= 'a', so you can be sure it works the same as that expression for every possible input.
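The equivalence can also be checked by brute force. The sketch below, with illustrative function names, writes the range check both ways and counts disagreements over all 256 byte values. The cast to unsigned char stands in for the 8-bit wraparound of sub al, 'a'.

```c
#include <stdbool.h>

/* The readable form: two compares. */
static bool is_lower_two_compares(unsigned char c)
{
    return c <= 'z' && c >= 'a';
}

/* The sub + unsigned-compare form, as in  sub al,'a' / cmp al,'z'-'a' / ja. */
static bool is_lower_sub_cmp(unsigned char c)
{
    /* Truncating to 8 bits makes inputs below 'a' wrap to a high value. */
    unsigned char t = (unsigned char)(c - 'a');
    return t <= 'z' - 'a';
}

/* Count the byte values (0..255) on which the two forms disagree. */
static int count_range_mismatches(void)
{
    int mismatches = 0;
    for (int c = 0; c < 256; c++) {
        if (is_lower_two_compares((unsigned char)c) != is_lower_sub_cmp((unsigned char)c))
            mismatches++;
    }
    return mismatches;
}
```

count_range_mismatches() should return 0, since both forms accept exactly the 26 values 'a' through 'z'.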
Other style notes: normally you'd just increment ESI, instead of having both a pointer and an index. Also, you may not need mov edx, eax if you can use the AL value (0-25 index into the alphabet). Making a copy and using this "destructive" test is usually better than 2 separate branches.
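As one possible illustration of both points (walking a single pointer, and using the 0-25 result directly), here is a small C sketch of a case-insensitive letter histogram. The function letter_histogram and its interface are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical helper: counts[0] receives the number of 'a'/'A' bytes, ...,
   counts[25] the number of 'z'/'Z' bytes. Returns the number of non-letter bytes. */
static size_t letter_histogram(const char *s, size_t counts[26])
{
    size_t non_letters = 0;

    for (size_t k = 0; k < 26; k++)
        counts[k] = 0;

    /* Only the pointer advances; there is no separate index. */
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++) {
        /* The "destructive" test leaves 0..25 for letters, which is used as the index. */
        unsigned char idx = (unsigned char)((*p | 0x20) - 'a');
        if (idx <= 'z' - 'a')
            counts[idx]++;
        else
            non_letters++;
    }
    return non_letters;
}
```

Here no copy of the original character is needed, because the value left by the test is exactly what the loop body wants.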
NASM syntax allows character constants like C, so you can write 0x41 as 'A', or 0x7A as 'z', e.g. cmp al, 'a'. Then you don't even need to comment the line.
Writing the loop that way (with the load and test at the next_char label at the bottom, feeding the conditional branch) saves an unconditional jmp at the bottom. Fewer instructions in the loop = better. The only point of writing asm these days is performance, so it makes sense to learn good techniques like this from the start, if it's not too confusing. No assembly answer would be complete without a link to http://agner.org/optimize/.
Output of ascii(1), or see http://www.asciitable.com/:

Dec Hex    Dec Hex    Dec Hex  Dec Hex  Dec Hex  Dec Hex   Dec Hex   Dec Hex
  0 00 NUL  16 10 DLE  32 20    48 30 0  64 40 @  80 50 P   96 60 `  112 70 p
  1 01 SOH  17 11 DC1  33 21 !  49 31 1  65 41 A  81 51 Q   97 61 a  113 71 q
  2 02 STX  18 12 DC2  34 22 "  50 32 2  66 42 B  82 52 R   98 62 b  114 72 r
  3 03 ETX  19 13 DC3  35 23 #  51 33 3  67 43 C  83 53 S   99 63 c  115 73 s
  4 04 EOT  20 14 DC4  36 24 $  52 34 4  68 44 D  84 54 T  100 64 d  116 74 t
  5 05 ENQ  21 15 NAK  37 25 %  53 35 5  69 45 E  85 55 U  101 65 e  117 75 u
  6 06 ACK  22 16 SYN  38 26 &  54 36 6  70 46 F  86 56 V  102 66 f  118 76 v
  7 07 BEL  23 17 ETB  39 27 '  55 37 7  71 47 G  87 57 W  103 67 g  119 77 w
  8 08 BS   24 18 CAN  40 28 (  56 38 8  72 48 H  88 58 X  104 68 h  120 78 x
  9 09 HT   25 19 EM   41 29 )  57 39 9  73 49 I  89 59 Y  105 69 i  121 79 y
 10 0A LF   26 1A SUB  42 2A *  58 3A :  74 4A J  90 5A Z  106 6A j  122 7A z
 11 0B VT   27 1B ESC  43 2B +  59 3B ;  75 4B K  91 5B [  107 6B k  123 7B {
 12 0C FF   28 1C FS   44 2C ,  60 3C <  76 4C L  92 5C \  108 6C l  124 7C |
 13 0D CR   29 1D GS   45 2D -  61 3D =  77 4D M  93 5D ]  109 6D m  125 7D }
 14 0E SO   30 1E RS   46 2E .  62 3E >  78 4E N  94 5E ^  110 6E n  126 7E ~
 15 0F SI   31 1F US   47 2F /  63 3F ?  79 4F O  95 5F _  111 6F o  127 7F DEL