Can somebody help me with some assembly code? I need to convert every character to an integer, for example A to 1, B to 2, C to 3, D to 4, and so on up to Z to 26.

  • 2
    In all Western charsets and encodings you just have to subtract 64 to transform A, B, C, ... to 1, 2, 3, ..., or subtract 96 to transform a, b, c, ... to 1, 2, 3, ... . Validate your input. Also [check an ASCII table](https://www.cs.cmu.edu/~pattis/15-1XX/common/handouts/ascii.html) (the code points and the encoding of the first 128 characters are the same in all the common Western encodings) – Margaret Bloom Feb 01 '18 at 19:59
  • @MargaretBloom Do you have some code or something similar so I can see what happens? I don't have much experience, and nothing I found on Google seems to help. This is part of my project, which I need to complete along with many other tasks. – Tijana Lazarova Feb 01 '18 at 20:36
  • There is something [in this post](https://stackoverflow.com/questions/13595808/assembly-nasm-how-to-do-numerical-operations). – Margaret Bloom Feb 01 '18 at 20:42

1 Answer

One of the simplest methods for getting a numerical equivalent of an alphabetic ASCII character is to clear bits 5, 6 and 7 (the three high-order bits).

A = 65 = 41H = 0100 0001
Z = 90 = 5AH = 0101 1010

a = 97 = 61H = 0110 0001
z = 122 = 7AH = 0111 1010

So stripping the three high-order bits leaves you with the equivalent number, regardless of upper or lower case.

    and    al, 1FH
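
As a quick illustration of what the mask does, here is a short fragment worked by hand (a sketch assuming NASM syntax; the sample bytes are my own choice, not from the original answer). It shows that both cases of a letter map to the same number, and that a non-letter byte also yields a small number:

```nasm
    mov    al, 'C'        ; AL = 43H = 0100 0011
    and    al, 1FH        ; AL = 03H, the third letter

    mov    al, 'c'        ; AL = 63H = 0110 0011
    and    al, 1FH        ; AL = 03H, same result for lower case

    mov    al, '1'        ; AL = 31H = 0011 0001 (a digit, not a letter)
    and    al, 1FH        ; AL = 11H = 17, indistinguishable from 'Q'
```

The mask alone cannot tell letters from other bytes, which is why the input needs validating.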

It is a good idea to range check the input first:

    lodsb
    and    al, 5FH        ; Converts to upper case
    cmp    al, 'A'
    jb     Error
    cmp    al, 'Z'
    ja     Error

    xor    al, 40H        ; Clears bit 6, the only bit still set above the low five
                          ; and now you're left with 1 to 26 in AL
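
To show how the fragment above might sit inside a loop over a whole string, here is a minimal sketch. It assumes NASM syntax, 32-bit flat-model code, ESI pointing at a NUL-terminated input string and EDI at an output buffer of at least the same length. The routine name, the labels and the carry-flag error convention are hypothetical, not part of the original answer. It keeps the answer's `and al, 5FH` mask unchanged, so it inherits that mask's behaviour for bytes with the high bit set, which the comments below discuss.

```nasm
; Hypothetical routine: converts letters at [ESI] to 1..26 at [EDI].
; Returns with CF clear on success, CF set if a non-letter was found.
convert_string:
    cld                   ; make LODSB/STOSB move forward
.next:
    lodsb                 ; AL = next input byte, ESI advances
    test   al, al         ; NUL terminator?
    jz     .done
    and    al, 5FH        ; fold to upper case (as in the answer)
    cmp    al, 'A'
    jb     .error
    cmp    al, 'Z'
    ja     .error
    xor    al, 40H        ; 'A'..'Z' -> 1..26
    stosb                 ; store result, EDI advances
    jmp    .next
.done:
    clc                   ; success
    ret
.error:
    stc                   ; signal invalid input
    ret
```

A caller would load ESI and EDI, `call convert_string`, and branch on the carry flag with `jc` to handle invalid input.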

If you want to zero-index your value, just:

    dec    al
Shift_Left
  • 3
    You could reduce it to one compare by subtracting 'A' from the Character and then compare with 25. If it is above then it is out of range. – Michael Petch Feb 02 '18 at 01:35
  • as Michael said, and add " ... if it's not out of range, it is the result (mind the offset: A=0, B=1, C=2 etc)" – Tommylee2k Feb 02 '18 at 09:24
  • If you want to zero-index your value, `sub al, 'A'` instead of using `xor`. There's no advantage to using bit-twiddling instead of a `sub` here. Also, your `and` masks more than just the upper/lower case bit, so you end up accepting some bytes as upper-case characters when in fact they were non-ASCII bytes with the high bit set. Instead, `and al, ~0x20`, or force to lowercase with `or al, 0x20`. – Peter Cordes Feb 03 '18 at 02:41
  • re: the unsigned compare trick: see https://stackoverflow.com/a/35936844/224132 for efficient asm that flips the case of ASCII alphabetic characters only. related: C++ intrinsics for a vectorized string `toupper`: https://stackoverflow.com/a/37151084/224132 using SSE2. – Peter Cordes Feb 03 '18 at 02:46
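
The suggestions in the comments above (force the case with `or al, 0x20`, subtract, then use a single unsigned compare against 25) can be combined roughly as follows. This is a sketch assuming NASM syntax; the routine name, the label and the carry-flag convention are hypothetical and not taken from any of the commenters.

```nasm
; Hypothetical routine: input byte in AL.
; Output: AL = 1..26 and CF clear if the byte was an ASCII letter,
;         CF set otherwise (AL is then meaningless).
letter_to_index:
    or     al, 20H        ; force lower case: 'A'..'Z' become 'a'..'z'
    sub    al, 'a'        ; letters become 0..25; any other byte wraps
                          ; around to an unsigned value of 26 or more
    cmp    al, 25
    ja     .not_letter    ; one unsigned compare covers both range ends
    inc    al             ; 0..25 -> 1..26 (skip this for zero-indexing)
    clc                   ; CMP may have set CF, so clear it explicitly
    ret
.not_letter:
    stc
    ret
```

Because `or al, 20H` only sets bit 5 and leaves the high bit alone, a byte such as C1H becomes E1H and is rejected, whereas masking with 5FH would have turned it into 'A'.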