When I declare a string in assembly like this:

string DB "My string", 0

Where is the string saved? Can I determine where it will be saved when declaring it?

zimmerrol
user8097385
  • Your code will tell where it will be. Depending on the assembler there are different ways of doing it. – Sami Kuhmonen Dec 02 '17 at 08:17
  • Can you elaborate on what you mean by "const strings" in the title vs. the byte declaration in the question (which just defines byte values and does not show any "const" feature)? If you use that `db` in a common assembler in an ordinary data segment, then you **can modify** that memory at runtime. Runtime "const" on x86 is achieved by organizing the memory layout into different pages and setting the pages holding "const" data to read-only access. There can also be compile-time-only "const", as in MASM, or both, as in C/C++. It is not clear what you are asking about. – Ped7g Dec 02 '17 at 10:23
  • @Ped7g Can't he simply write `mov si, offset string`? The SI register will then hold the offset of the string, and with DS:SI he can locate the string in memory. – Ahtisham Dec 02 '17 at 20:26
  • @Ahtisham yes, you can get the memory address by that in TASM/MASM. But that doesn't say anything where you declared it during source writing. You can actually deliberately position where the string should "land" in the binary, by declaring it at proper place in source code. If you want it in "data" segment, you have to declare it in "data" section, etc... You as programmer are in (almost) full control over the emitted machine code from assembler, so you can determine where it will be saved, by declaring it in the desired section of source. Getting its exact address is that simple `mov` then. – Ped7g Dec 02 '17 at 22:07
  • @Ped7g Then I think that is the simplest way to do it. Should I post it as an answer? – Ahtisham Dec 03 '17 at 02:17
  • @Ahtisham no, you are answering something else, not the OP's question. Your instruction gets the address at runtime. The OP is asking how to deliberately decide where the data will "land" in the binary, and whether it is possible to affect that (at least that's how I understand the question; it's not very clear). – Ped7g Dec 03 '17 at 03:02

1 Answer

`db` assembles output bytes at the current position in the output file. You control exactly where they go.

There is no indirection or reference to any other location. It's like `char string[] = "blah blah"`, not `char *string = "blah blah"`, except that there is no implicit zero byte at the end, which is why you have to write `,0` to add one explicitly.


When targeting a modern OS (i.e. not making a boot-sector or something), your code + data will end up in an object file and then be linked into an executable or library.

On Linux (or other ELF platforms), put read-only constant data, including strings, in `section .rodata`. This section (along with `section .text`, where you put code) becomes part of the text segment after linking.

Windows apparently uses `section .rdata`.

Different assemblers have different syntax for changing sections, but I think `section .whatever` works in most of the ones that use `DB` for data bytes.


;; NASM source for the x86-64 System V ABI.

section .rodata            ; use section .rdata on Windows
string DB "My string", 0

section .data
static_storage_for_something: dd 123    ; one dword with value = 123
;; usually you don't need .data and can just use registers or the stack

section .bss                 ; zero-initialized memory, bytes not stored in the executable, just size
static_array: resd 12300000       ;; 12300000 dwords with value = 0

section .text
extern puts     ; defined in libc

global main
main:
    mov   edi, string      ; RDI = address of string = first function arg
    ;mov  dword [rdi], 1234 ; would segfault because .rodata is mapped read-only
    jmp   puts             ; tail-call puts(string)

peter@volta:/tmp$ cat > string.asm
  (and paste the above, then press control-D)
peter@volta:/tmp$ nasm -f elf64 string.asm  && gcc -no-pie string.o && ./a.out
My string
peter@volta:/tmp$ echo $?
10

The exit status of 10 is the return value from `puts`: because we tail-called it, its return value becomes the return value of `main`, which in turn becomes the exit status of our program. (Linux glibc `puts` apparently returns the character count in this case, here 9 characters plus the newline. But the manual only says that it returns a non-negative number on success, so don't count on this.)

I used `-no-pie` because I used an absolute address for `string` with `mov` instead of a RIP-relative `LEA`.


You can use `readelf -a a.out` or `nm` to look at what went where in your executable.

Peter Cordes
  • On Linux, put strings into the `.strings` section so the linker can eliminate duplicate strings. – fuz Dec 02 '17 at 13:21
  • @fuz: Do you need `.size` symbol attributes for that to work? IDK how to do that with NASM. – Peter Cordes Dec 02 '17 at 13:30
  • I'm not entirely sure. I'd have to read all the documentation to be sure. You can set the symbol size in nasm [like this](http://www.nasm.us/doc/nasmdoc7.html#section-7.9.5). – fuz Dec 02 '17 at 14:13
  • @fuz: Feel free to edit this if you investigate further. I probably won't get back to this soon. – Peter Cordes Dec 02 '17 at 14:21
  • @PeterCordes Can't he simply write `mov si, offset string`? The SI register will then hold the offset of the string, and with DS:SI he can locate the string in memory. – Ahtisham Dec 02 '17 at 20:29
  • @Ahtisham: where does the OP say anything about writing 16-bit code? My example is for x86-64, where memory is flat: all segments (except FS and GS) implicitly have base=0. (And in the System V default position-dependent memory model (i.e. on Linux with `gcc -no-pie`), static symbols go in the low 2GiB of virtual memory, so their addresses fit in zero or sign-extended 32-bit immediates, and thus `mov r32, imm32` gives you the correct 64-bit address). And besides, the real answer is the part about what section to put it in, not how you later get the address. That's just an example. – Peter Cordes Dec 03 '17 at 04:46