I am writing a bare-metal program in 16-bit real-mode assembly with NASM. I want to sleep (pause execution) for a given number of milliseconds, but I have not found a way to do this.

Edit: This is my code. I want to add about 0.3 seconds of delay between each character getting typed to the screen.

[bits 16]    ; use 16 bits
[org 0x7c00] ; sets the start address

init: 
  mov si, msg  ; loads the address of "msg" into SI register
  mov ah, 0x0e ; sets AH to 0xe (function teletype)
print_char:
  lodsb     ; loads the current byte from SI into AL and increments the address in SI
  cmp al, 0 ; compares AL to zero
  je done   ; if AL == 0, jump to "done"
  int 0x10  ; print to screen using function 0xe of interrupt 0x10
  jmp print_char ; repeat with next byte
done:
  hlt ; stop execution

msg: db "The quick brown fox jumps over the lazy dog.", 0 ; we need to explicitly put the zero byte here

times 510-($-$$) db 0           ; fill the output file with zeroes until 510 bytes are full
dw 0xaa55                       ; magic number that tells the BIOS this is bootable
H4ZE
  • There's [How to set 1 second time delay at assembly language 8086](https://stackoverflow.com/q/15201955) but most of those answers aren't millisecond-accurate. Or do you mean without even using BIOS services? Are you talking about IBM-PC compatible hardware with (emulated) legacy timer chips? (Or real chips on a retro system?) If you want to use a delay loop on modern x86, with interrupts enabled you should spin on RDTSC like in this Q&A: [How to calculate time for an asm delay loop on x86 linux?](https://stackoverflow.com/q/49924102) – Peter Cordes Oct 12 '21 at 21:43
  • Does the system have a Programmable Interval Timer? – Jim Rhodes Oct 12 '21 at 21:56
  • Since you have the BIOS available (you're already using `int 0x10`) and your delay is much longer than a millisecond, `int 0x15` / `ah = 0x86` should fit the bill perfectly, see https://stackoverflow.com/a/22179837/634919 on Peter's link. It's also more polite and power-friendly than a busy-wait on a modern system or emulator; a BIOS can halt the CPU and an emulator can give up a timeslice. – Nate Eldredge Oct 13 '21 at 22:00
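
The `int 0x15` / `ah = 0x86` suggestion in the last comment can be sketched as follows. This is a minimal, untested sketch rather than a definitive implementation. It assumes the BIOS (or emulator) implements the wait service, which takes the interval in microseconds in CX:DX and returns with the carry flag set if the call fails. The label `delay_300ms` is a hypothetical helper name, not something from the question. 300 ms is 300000 µs = 0x000493E0, so CX = 0x0004 and DX = 0x93E0.

```nasm
[bits 16]
[org 0x7c00]

init:
  xor ax, ax
  mov ds, ax          ; DS = 0 so that "msg" resolves correctly with org 0x7c00
  mov ss, ax
  mov sp, 0x7c00      ; stack grows down from just below the boot sector
  cld                 ; make lodsb walk forward through the string
  mov si, msg
print_char:
  lodsb               ; AL = next character, SI advances
  cmp al, 0
  je done
  mov ah, 0x0e        ; BIOS teletype function
  mov bx, 0x0007      ; BH = display page 0, BL = color 7
  int 0x10
  call delay_300ms    ; hypothetical helper defined below
  jmp print_char
done:
  hlt
  jmp done            ; HLT returns after an interrupt, so loop back to it

; Assumption: int 0x15, AH=0x86 is available; CX:DX = wait time in microseconds.
; AX, CX and DX are preserved for the caller. The error return (CF set) is ignored here.
delay_300ms:
  push ax
  push cx
  push dx
  mov ah, 0x86
  mov cx, 0x0004      ; high word of 300000 (0x000493E0)
  mov dx, 0x93e0      ; low word of 300000
  int 0x15
  pop dx
  pop cx
  pop ax
  ret

msg: db "The quick brown fox jumps over the lazy dog.", 0

times 510-($-$$) db 0
dw 0xaa55
```

If the service turns out to be unsupported on a given machine or emulator, the call may return immediately with the carry flag set, in which case a different delay method is needed.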

1 Answer

One day, I too needed a delay routine capable of doing delays ranging from 0.5 sec to just a few msec. Read all about it in this CodeReview question, and especially the reason why I needed to take this approach.

My solution was to find out how many iterations a delay routine can do in the interval between 2 ticks of the standard 18.2 Hz timer. Those ticks are 55 msec apart (the code below times a 4-tick period of 219.7 msec and scales accordingly). Because measurements can sometimes be erratic, I only accepted a result if 2 consecutive measurements differed by less than 1‰ (0.1%). Finally I divided the good measurement by 55 to obtain the number of iterations per msec, aka SpeedFactor. From then on, whenever I wanted to pause the program, I multiplied the desired delay in msec by this SpeedFactor and performed that number of iterations within the delay routine.

The full code:

[bits 16]
[org 0x7C00]

                xor     ax, ax
                mov     ds, ax
                mov     es, ax
                mov     ss, ax
                mov     sp, 0x7C00
                cld

; Measure the number of iterations (within the ShortWait routine) per msec
; Only accept if consecutive measurements vary by less than 0.1% (1 per mille)
; If measurements remain erratic, then accept the last one
                mov     bp, 10                  ; Max try
                call    GetSpeedFactor          ; -> DX:AX
.a:             xchg    si, ax                  ; 'mov si, ax'
                mov     di, dx
                call    GetSpeedFactor          ; -> DX:AX
                push    ax dx                   ; (1)
.b:             sub     ax, si
                sbb     dx, di
                jnb     .c
                add     ax, si
                adc     dx, di
                xchg    si, ax
                xchg    di, dx
                jmp     .b
.c:             mov     cx, 1000
                xchg    ax, cx
                mul     dx
                xchg    ax, cx
                mov     dx, 1000
                mul     dx
                add     dx, cx
                sub     si, ax
                sbb     di, dx
                pop     dx ax                   ;(1)
                cmc
                dec     bp
                jnbe    .a
                mov     [SpeedFactor], ax
                mov     [SpeedFactor+2], dx

                mov     si, msg
                lodsb
More:           mov     bx, 0x0007              ; BH DisplayPage 0, BL GraphicsColor 7
                mov     ah, 0x0E                ; BIOS.Teletype
                int     10h
                mov     bx, 300                 ; 0.3 sec 
                call    Pause
                lodsb
                cmp     al, 0
                jne     More

                cli
                hlt
                jmp     short $-2

msg             db      "The quick brown fox jumps over the lazy dog.", 0
SpeedFactor     dd      0
; ----------------------------------------------
; IN () OUT (dx:ax)
; Wait for the start of a new TimerTick period (54.9254 msec)
; Then measure a 4 tick period (219.7016 msec)
GetSpeedFactor: push    bx cx
                mov     bx, 1
                call    .ShortWait              ; -> DX:AX BX=0
                mov     bl, 4                   ; BH=0
                call    .ShortWait              ; -> DX:AX BX=0
                mov     cx, 10
                xchg    ax, cx
                mul     dx
                xchg    ax, cx
                mov     dx, 10
                mul     dx
                add     dx, cx
                mov     cx, 2197
                xchg    ax, bx                  ; BX=0
                xchg    dx, ax
                div     cx
                xchg    ax, bx
                div     cx
                mov     dx, bx
                pop     cx bx
                ret
; - - - - - - - - - - - - - - - - - - - - - - -
.ShortWait:     mov     ax, -1
                cwd
; ---   ---   ---   ---   ---   ---   ---   ---
; IN (dx:ax,bx) OUT (dx:ax,bx)
; Do DX:AX iterations or loop until Timer did BX Ticks
ShortWait:      push    ds cx si di
                xchg    si, ax                  ; 'mov si, ax'
                mov     di, dx
                xor     ax, ax
                cwd
                mov     ds, ax
.a:             mov     cx, [046Ch]             ; BIOS Timer
.b:             sub     si, 1
                sbb     di, 0
                jb      .c
                add     ax, 1
                adc     dx, 0
                cmp     cx, [046Ch]
                je      .b
                dec     bx
                jnz     .a
.c:             pop     di si cx ds
                ret
; ----------------------------------------------
; IN (bx) OUT ()
Pause:          push    ax bx dx
                mov     ax, [SpeedFactor+2]
                mul     bx
                xchg    bx, ax
                mul     word [SpeedFactor]
                add     dx, bx
                mov     bx, -1
                call    ShortWait               ; -> DX:AX BX
                pop     dx bx ax
                ret
; ----------------------------------------------

times 510-($-$$) db 0
dw 0xAA55   

The code assembles with FASM. For NASM, you will need to change code like

push ax bx dx
...
pop  dx bx ax

into

push ax
push bx
push dx
...
pop  dx
pop  bx
pop  ax
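
As an alternative to expanding every line by hand, NASM's multi-parameter macros can stand in for the FASM multi-operand forms. The sketch below is untested against the full listing. The macro names `pushm` and `popm` are my own (hypothetical), and both process their operands in the order written, so `push ax bx dx` becomes `pushm ax, bx, dx` and `pop dx bx ax` becomes `popm dx, bx, ax`.

```nasm
; Hypothetical helper macros, not part of the original answer.
; Each emits one PUSH (or POP) per argument, in the order listed.
%macro pushm 1-*
  %rep %0             ; %0 = number of arguments passed
    push %1
    %rotate 1         ; shift the argument list left so %1 is the next operand
  %endrep
%endmacro

%macro popm 1-*
  %rep %0
    pop %1
    %rotate 1
  %endrep
%endmacro

; Usage, mirroring the entry and exit of the Pause routine:
Pause:
    pushm ax, bx, dx
    ; routine body goes here
    popm  dx, bx, ax
    ret
```

Only the push/pop lines are covered by this; other FASM/NASM syntax differences in the listing would still need to be checked separately.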
Sep Roland
  • Your code assumes constant CPU frequency. That won't be guaranteed on some modern systems, even in 16-bit mode. I think the BIOS leaves the CPU in the highest P-state setting when booting an MBR, so turbo is up to the CPU. A low-power laptop might have to throttle back from max turbo after some time due to thermal limits. (Maybe not likely just running scalar integer code on modern CPUs, but probably possible for some generations of hardware, like maybe Sandybridge or Nehalem from 10 to 13 years ago. Or on a laptop with purely passive cooling, like a Surface 2-in-1 tablet/laptop.) – Peter Cordes Oct 13 '21 at 21:55
  • Your calibration will probably happen with the CPU at max turbo, especially since you're discarding early unstable results. If CPU frequency drops later, ShortWait calls will wait too long. – Peter Cordes Oct 13 '21 at 21:57
  • @PeterCordes The code was written for use in regular programs. I have never seen any irregularities. I have never tried to use it in the boot phase and I am not aware of these technical details, but I would expect the BIOS.TimerTick to be correct even in a bootloader. – Sep Roland Oct 13 '21 at 22:05
  • I'd expect the BIOS timer tick interval to be correct always, too. But it's possible that the number of loop iterations you can do in one tick interval might *not* be, depending on the system. *Most* systems will be able to sustain max turbo on a single core basically all the time, even while running this delay loop and/or whatever other 16-bit code is in some program that uses it (except maybe SSE2 SIMD FP math), so I'm not surprised you haven't seen variability on your hardware. – Peter Cordes Oct 13 '21 at 22:24
  • Also, on most desktops it would only be a small effect, like 3.4 GHz vs. 3.8 GHz on a Haswell i5 4670. (Look at the non-"T" models in https://en.wikipedia.org/wiki/Haswell_(microarchitecture)#Desktop_processors). And that's if turbo drops all the way down from max to non-turbo, rather than to some lower turbo level. Sleeping for 3.8/3.4 = 1.11 times too long wouldn't be very noticeable if you weren't looking for it. Low-power laptops have bigger ratios of sustainable vs. peak turbo clocks. – Peter Cordes Oct 13 '21 at 22:28
  • Or maybe I'm wrong and most BIOSes don't leave turbo enabled at all, in which case yeah you're fairly likely to see constant CPU frequency, unless the vents are blocked or fan is broken and you get thermal throttling to below the normal stock non-turbo frequency. For delays on the order of milliseconds, out-of-order exec effects aren't much of a problem. (i.e. overlapping execution of stuff before/after the delay loop with the actual delay.) – Peter Cordes Oct 13 '21 at 22:31
  • IIRC, AMD CPUs "boost" limits are dynamically determined based on temperature (and power-supply stability?). So actual performance is more variable, and might even vary with workload. So if anyone's looking for this effect, a modern AMD CPU might be easier to see it in. Or like I said, a low-power laptop, especially with passive cooling. [Why can't my ultraportable laptop CPU maintain peak performance in HPC](https://stackoverflow.com/q/36363613) shows dropping CPU frequency with SIMD FP workloads. With single-core integer workloads, even that machine might maintain peak turbo, IDK. – Peter Cordes Oct 13 '21 at 22:35