
This is my first time writing in Assembly, so the code might not be perfect (or even good), but it is how I worked it out, so don't judge too much :). It is written for Intel x86 using DOSBox and Turbo Assembler/Linker/Debugger, if that makes any difference. The task is to read a line from the user and convert all uppercase letters to lowercase (leaving everything else as is). When I print each character one by one and add '\n' (new line) after it, it works fine. However, I need the output on a single line, and when I print without '\n', the very first character is skipped. Why does it do that, how do I fix it, and why does it work with '\n'?

; Task is to get input from the user and convert any upper case letters into lower case
; and print modified line

.model small
.stack 100h

.data
    start_msg db "Enter a line:", 0Dh, 0Ah, 24h    ; "line\n", $
    out_msg db "Converted line:", 0Dh, 0Ah, 24h
    buffer db 100, ?, 100 dup(0)
.code

start:
    mov dx, @data                  ; moves data into dx
    mov ds, dx                     ; moves dx (data) into data segment

    ; prints start_msg
    mov     ah, 09h
    mov     dx, offset start_msg
    int     21h
    
    ; reads the input line and puts it into buffer
    
    mov ah, 0Ah
    mov dx, offset buffer
    int 21h
    
    ; prints '\n' 
    mov dl, 0Ah 
    mov ah, 02h
    int 21h
    
    ; prints out_msg
    mov ah,  09h
    mov dx, offset out_msg
    int 21h
    
    ; puts a pointer to the first character of the input line in bx
    mov bx, offset buffer + 2
    loopString: 
        
        mov dl,[bx] ; reads a symbol from buffer and puts it into dl

        cmp dl, 'A' ; if symbol is at or above 'A', jump to ifUpper
        jae ifUpper

        ; prints symbol
        print:
            mov ah, 02h
            int 21h
        
        ; checks, if line ended
        inc bx ; bx++
        cmp dl, 0
        je endLoop ; ends looping
        
        ; temporary code: prints '\n' after each printed symbol
        ; if it is commented out, the first character of the input line is skipped
        ; everything works with it enabled, but I need the final output on one line
        ;mov dl, 0Ah ; Uncomment me
        ;mov ah, 02h ; Uncomment me
        ;int 21h ; Uncomment me
        
        jmp loopString ; if line end was not detected, loops again 
        
    ifUpper:
        cmp dl, 'Z' ; checks if symbol is "more" than 'Z'
        ja print ; it is not upper, just prints the symbol
        add dl, 20h ; +32 (converts upper to lower)
        jmp print ; prints converted symbol
        
    endLoop:
    
    mov ah, 4ch             ; back to dos
    mov al, 0               ; exit code 0 (no error)
    int 21h                
end start

Uncomment the lines marked "Uncomment me" to compare the output with and without '\n'. Thanks in advance.

  • You are printing the "null term" (a char with value 0). This somehow messes up the first char on the current line. Move the check `cmp dl, 0` to just after `mov dl, [bx]`. You can seize the opportunity to rewrite the loop logic more linearly. `print` should be the last operation in the loop; then use fallthrough logic by writing the two checks back to back with their complement conditions. Before the `print` label, you add `20h` to the char. All that said, it's a good piece of code for a first timer :) – Margaret Bloom Sep 19 '21 at 14:56
  • @MargaretBloom Thank you very much, it worked! Do you have any idea what could have caused it exactly? – Vytenis Kajackas Sep 19 '21 at 15:29
  • The null char is printed as a space. For some reason, it acts like a `\r` and resets the cursor position to the first column of the current row. To be honest, I was unable to find evidence of such behavior in the DOSBox sources but it must be there. – Margaret Bloom Sep 19 '21 at 16:46
  • @MargaretBloom DOSBox has it right. The zero that the OP found is preceded by the carriage return that DOS always appends to the inputted characters. The cursor already moved to the start of the line and a space overwrote the first character. – Sep Roland Sep 19 '21 at 22:23
  • Thanks @SepRoland, that makes much more sense now. I didn't know DOS ends an inputted string with a CR. – Margaret Bloom Sep 20 '21 at 08:45

1 Answer

    cmp dl, 0
    je endLoop ; ends looping

This is the problematic part of your code. You should be comparing to the value 13 (carriage return) instead. I suggest you read about how the DOS.BufferedInput function (`int 21h` with `ah = 0Ah`) works in this Q/A. DOS always includes the carriage return as a terminating byte.

  • You have chosen to compare with zero because you started from an all-zeroes input buffer. However, the zero that you are looking for might not even be there anymore by the time DOS returns control to your code! If the user at the keyboard first types a (very) long text and then starts backspacing, the zero is gone.

  • In case the zero is still in the right place, there's a second reason why it fails. You have placed the check for zero after printing it, which doesn't make sense since it's not part of the input, is it? Now an ASCII code 0 prints as a space character and because the byte preceding the zero was inevitably a carriage return (13), the cursor already moved to the beginning of the line and the first character on the line was erased. There's really nothing magic about printing an ASCII code 0.
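To make the two points above concrete, here is a rough sketch of what the question's `buffer` would hold after function 0Ah returns, assuming (purely for illustration) that the user typed `Hi` and pressed Enter. The three instructions after the layout show one way to locate the terminating carriage return through the count byte that DOS stores at `buffer + 1`; the register choice is arbitrary.

```asm
; buffer db 100, ?, 100 dup(0)   after int 21h / ah = 0Ah, input "Hi" + Enter
;   buffer+0 : 100   maximum size, set by the program
;   buffer+1 : 2     number of characters read, set by DOS (CR not counted)
;   buffer+2 : 'H'
;   buffer+3 : 'i'
;   buffer+4 : 13    carriage return appended by DOS
;   buffer+5 : 0     still zero only if nothing longer was typed and erased

    mov  bl, [buffer + 1]       ; BL = character count stored by DOS
    xor  bh, bh                 ; BX = count, zero-extended to 16 bits
    mov  al, [buffer + 2 + bx]  ; AL = 13, the byte right after the text
```

Seen this way, the byte that reliably follows the text is the 13, not the 0.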

This is how you could write it. Because you do want to print the terminating carriage return, even an 'empty input' consists of one byte. Therefore a Repeat-Until loop is fine.

    mov  bx, offset buffer + 2
Repeat:
    mov  dl, [bx]
    cmp  dl, 'A'
    jb   print
    cmp  dl, 'Z'
    ja   print
    add  dl, 32     ; Make LCase
print:
    mov  ah, 02h
    int  21h
    inc  bx
    cmp  dl, 13
    jne  Repeat     ; Until 13 was printed
    mov  dl, 10 
    int  21h
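As an alternative to testing for the terminator, the loop could also be bounded by the count byte that DOS stores at `buffer + 1`. The following is only a sketch of that variant, not part of the solution above: the labels `nextChar`, `emit` and `done` are made up for the example, and unlike the Repeat-Until loop it does not print the carriage return, so the line break at the end is written out explicitly.

```asm
    mov  bx, offset buffer + 2  ; first typed character
    mov  cl, [buffer + 1]       ; CL = number of characters typed
    xor  ch, ch                 ; CX = same count, zero-extended
    jcxz done                   ; empty input: skip the loop entirely
nextChar:
    mov  dl, [bx]
    cmp  dl, 'A'
    jb   emit
    cmp  dl, 'Z'
    ja   emit
    add  dl, 20h                ; 'A'..'Z' -> 'a'..'z'
emit:
    mov  ah, 02h                ; DOS: write the character in DL
    int  21h
    inc  bx
    loop nextChar               ; decrement CX, repeat while CX <> 0
done:
    mov  ah, 02h
    mov  dl, 13                 ; carriage return
    int  21h
    mov  ah, 02h
    mov  dl, 10                 ; line feed
    int  21h
```

The count-driven form never reads past the typed text, so it does not depend on what happens to follow the input in the buffer.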
Sep Roland