36

I know that CLD clears the direction flag and STD sets it. But what's the point of setting and clearing the direction flag?

Peter Cordes
Melika

5 Answers

48

The direction flag (DF) controls the direction in which string instructions move their pointer registers. These are the same instructions that can be used with the REP prefix to repeat the operation (although lods isn't very useful with rep).

The string instructions are: MOVS (copy mem to mem), STOS (store AL/AX/EAX/RAX), SCAS (scan string), CMPS (compare string), and LODS (load string). There's also INS/OUTS for copying between memory and an I/O port. The first five are available in byte, word, dword, and (in 64-bit mode) qword operand sizes; INS/OUTS stop at dword.

In a nutshell, when the direction flag is 0, the instructions work by incrementing the pointer to the data after every iteration (until ECX is zero or some other condition, depending on the flavour of the REP prefix), while if the flag is 1, the pointer is decremented.

For example, movsd copies a dword from [ds:esi] to [es:edi] (rsi and rdi in 64-bit mode) and then adjusts both pointers. The "Operation" section of the linked ISA reference manual entry, extracted from Intel's PDFs, describes it like this:

dword [es:edi] = dword [ds:esi]      // 4-byte copy memory to memory
if (DF == 0)
    esi += 4;
    edi += 4;
else  // DF == 1
    esi -= 4;
    edi -= 4;
fi

With a REP prefix, it does this ECX times, and modern x86 CPUs have optimized "fast strings" microcode that does the copying (or stos storing) with 16-byte or 32-byte internal operations. See also this Q&A about memory bandwidth and the ERMSB feature. (Note that only rep stos and rep movs are optimized this way, not repne/repe scas or cmps).
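
As a rough illustration, the pseudocode above can be modelled in portable C. This is only a sketch of the architectural behaviour, not how the CPU or its microcode implements it. `struct regs`, `rep_movsd` and the flat `mem` array are hypothetical names invented for this example, with byte offsets standing in for ESI/EDI and segments ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the registers that `rep movsd` reads and writes.
   Offsets into one flat byte array play the role of ESI and EDI. */
struct regs {
    size_t esi;  /* source offset */
    size_t edi;  /* destination offset */
    size_t ecx;  /* repeat count, in dwords */
    int    df;   /* direction flag: 0 = count up, 1 = count down */
};

/* Model of `rep movsd`: copy ECX dwords, stepping both offsets by +4 when
   DF=0 and by -4 when DF=1. Offsets wrap like unsigned registers do. */
static void rep_movsd(uint8_t *mem, struct regs *r)
{
    while (r->ecx != 0) {
        uint32_t tmp;
        memcpy(&tmp, &mem[r->esi], sizeof tmp);  /* load dword [esi]  */
        memcpy(&mem[r->edi], &tmp, sizeof tmp);  /* store dword [edi] */
        if (r->df == 0) {
            r->esi += 4;
            r->edi += 4;
        } else {
            r->esi -= 4;
            r->edi -= 4;
        }
        r->ecx--;
    }
}
```

With `df = 0`, `esi = 0`, `edi = 8` and `ecx = 2`, the model copies bytes 0..7 to 8..15 and leaves both offsets just past their buffers. With `df = 1` the offsets have to start at the last dword instead, and the copy runs from high addresses to low, which is the order an overlapping move toward a higher address needs.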

Peter Cordes
Daniel Kamil Kozar
  • Here is a [minimal runnable example with assertions](https://github.com/cirosantilli/linux-kernel-module-cheat/blob/6a9299599e781b29abfce64e4923ab0af3ef731d/userland/arch/x86_64/stos.S). – Ciro Santilli OurBigBook.com Jun 19 '19 at 21:08
12

CLD CLears the Direction flag: string instructions move forward through memory (pointers increment). STD SeTs the Direction flag: string instructions move backward (pointers decrement).

Kenneth Zarr
4

CLD: clears the direction flag so that string pointers auto-increment after each string operation.

STD: sets the direction flag to 1 so that SI and/or DI are automatically decremented to point to the next string element whenever a string instruction executes. With the flag set, SI/DI are decremented by 1 for byte strings and by 2 for word strings.
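
To make the step sizes described above concrete, here is a small C sketch. `string_step` is a hypothetical helper, not an instruction or a library function. It only returns the signed amount by which SI/DI change after one string instruction.

```c
#include <stddef.h>

/* Signed change applied to SI/DI after one string instruction.
   elem_size is the operand size in bytes: 1 for byte strings,
   2 for word strings (4 and 8 follow the same rule).
   df is the direction flag: 0 after CLD, 1 after STD. */
static ptrdiff_t string_step(int df, size_t elem_size)
{
    ptrdiff_t size = (ptrdiff_t)elem_size;
    return df ? -size : size;
}
```

So `string_step(1, 2)` gives -2 (a word string with DF set), and `string_step(0, 1)` gives +1 (a byte string with DF clear).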

This answer can be helpful for you.

Tanim_113
4

On Windows, the STDCALL calling convention says:

Under STDCALL, the direction flag is clear on entry and must be returned clear.

So if you set DF, you must clear it again before making an API call or returning.

Gunner
  • Operating System Dependent. – amanuel2 Oct 04 '16 at 20:02
  • This is common to most 32-bit / 64-bit calling conventions, including i386 System V and x86-64 System V. It lets you (or the compiler) efficiently inline `rep movsd` / `rep stosd` without CLD instructions. (On modern x86, they're often only fast going upward, with DF=0) – Peter Cordes Dec 03 '17 at 19:51
3

CLD: Clears the DF flag in the EFLAGS register. When the DF flag is set to 0, string operations increment the index registers (ESI and/or EDI).

Here is a simple example:

section .text
global main
main:
    mov ecx, len
    mov esi, s1
    mov edi, s2

    cld       ; redundant because DF is already guaranteed to be 0 on function entry
              ; but included for illustration purposes

loop_here:
    lodsb                ; AL=[esi],  ESI+=1 (because DF=0, otherwise ESI-=1)
    add al, 02
    stosb                ; [edi]=AL,  EDI+=1 (because DF=0, otherwise EDI-=1)
    loop loop_here       ; like dec ecx / jnz but without setting flags
    ; ECX=0, EDI and ESI pointing to the end of their buffers

    mov edx, len-1       ;message length, not including the terminating 0 byte
    mov ecx,s2           ;message to write
    mov ebx,1            ;file descriptor (stdout)
    mov eax,4            ;system call number (sys_write)
    int 0x80             ;call kernel

    mov  eax,1           ;system call number (sys_exit)
    xor  ebx,ebx
    int  0x80            ;call kernel: sys_exit(0)

section .data
s1: db 'password', 0        ; source buffer
len equ $-s1

section .bss
s2: resb len                ; destination buffer

(Assemble and link with nasm -felf32 caesar.asm && gcc -no-pie -m32 caesar.o -o caesar, or link it into a static executable with this as _start instead of main if you like.)

(This example is an attempt at a Caesar cipher.)
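
The loop above adds 2 to every byte, so 'y' and 'z' run past the end of the alphabet instead of wrapping to 'a' and 'b'. For comparison, here is a minimal C sketch of the same forward walk with wrap-around added. `caesar_shift` is a hypothetical helper name, and the choice to shift only lowercase ASCII letters while copying every other byte unchanged is an assumption of this sketch, not something the assembly does.

```c
#include <stddef.h>

/* Copy len bytes from src to dst, walking both buffers upward the way
   lodsb/stosb do with DF=0. Lowercase ASCII letters are shifted forward
   by `shift` positions and wrap past 'z'. Other bytes, including a
   terminating 0, are copied unchanged. */
static void caesar_shift(const char *src, char *dst, size_t len, unsigned shift)
{
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        if (c >= 'a' && c <= 'z')
            c = (char)('a' + ((unsigned)(c - 'a') + shift) % 26);
        dst[i] = c;
    }
}
```

Calling `caesar_shift("password", out, 9, 2)` with a 9-byte `out` buffer produces "rcuuyqtf", and "xyz" becomes "zab".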

Peter Cordes
N.S
  • The i386 System V calling convention already guarantees DF=0 on entry to a function, which is why your program works reliably even though you use `lodsb`/`stosb` before running `cld`. The `rep movsb` runs with ECX=0 (from `loop`), so it copies zero bytes past the end of `s1`/`s2`, and it doesn't matter what DF is set to at that point. – Peter Cordes Aug 14 '18 at 05:33
  • BTW, [`loop` is slow](https://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently), don't use it unless you're optimizing for code-size over speed. Also, you probably want to wrap around the last 2 letters of the alphabet to the first 2. So it does make sense to load and operate on a byte in a register, but only if you're going to implement a check for wrapping. Otherwise you could modify a string in-place with `add byte [esi], 2` / `inc esi`. Or do 4 bytes at once with `add dword [esi], 0x02020202`, if it won't carry... – Peter Cordes Aug 14 '18 at 05:37
  • thank you. i try to fix issues you mentioned in my code :) – N.S Aug 14 '18 at 06:35