
I'm trying out a simple piece of assembly code:

.section .data
output:
    .ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .bss
    .lcomm buffer, 12
.section .text
.code32
.globl _start
_start:
    movl $0, %eax
    cpuid
    movl $output, %edi

In the .bss section I defined a variable named "buffer".

When I try to get its address/value in gdb, it just prints:

(gdb) p $buffer
$1 = void

Using objdump, I found that the name is not in the ELF file. How can I keep this name information when running as and ld? Thank you!

Peter Cordes
Yichen
  • How do you assemble the source file? What commands and options do you use? – Some programmer dude Jun 30 '17 at 08:48
  • I tried `as -g -gstabs -gstabs+ -gdwarf-2 -ams` options, which can generate symbol name in *.o file, but this information is lost after `ld`. – Yichen Jun 30 '17 at 08:55
  • What `ld` command did you use? Also, you forgot `--32` on the `as` command line. So you're creating a 64-bit object file containing 32-bit machine code (thanks to the `.code32` directive). See my answer for details :P – Peter Cordes Jul 03 '17 at 04:14
  • Unless you're using a 32-bit Linux install, where `as` and everything else defaults to making 32-bit code. BTW, this code would assemble and run the same as 64-bit, but IDK what you have after it. – Peter Cordes Jul 03 '17 at 05:20

3 Answers


Using objdump, I found the name is not in ELF file

Works for me on Arch Linux with GNU binutils 2.28.0-3. Maybe you stripped your binary after linking?

$ gcc -Wall -m32 -nostdlib gas-symbols.S
$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, BuildID[sha1]=d5fdff41cc52e9de3b4cdae34cf4129de2b4a69f, not stripped

$ nm a.out 
080490ee B __bss_start
080490f0 b buffer           ### local symbol in the bss
080490ee D _edata
080490fc B _end
080490c4 d output
080480b8 T _start

I didn't need -g to preserve symbols in the executable. Also, on my system, -static is the default for -nostdlib. This is not always the case; see this Q&A about building asm source into 32 or 64-bit static or dynamic binaries, with gcc or with as and ld directly, or with NASM and ld.

(Note that .code32 doesn't change the object file format. You need to use build options, so it's probably better to omit .code32 so you're more likely to get errors (e.g. from push %ebx) if you try to build 32-bit code into a 64-bit object file.)

Using as and ld directly (which gcc does under the hood, use gcc -v to see how), I also get the same result.

$ as gas-symbols.S -o gas-symbols.o  --32 && 
  ld -o a.out gas-symbols.o  -m elf_i386
$ nm a.out 
...
080490b0 b buffer        ## Still there
...

In GDB, as Jester points out, print the address, not the value. GDB doesn't know it's an array, since you didn't use any directives to create debug info. (I wouldn't recommend trying to write such directives by hand; e.g. look at what gcc -S emits for static char foo[100]; in a file by itself.)

Anyway, GDB works if you use it right:

$ gdb ./a.out
(gdb) b _start
(gdb) r
Starting program: /home/peter/src/SO/a.out

Breakpoint 1, _start () at gas-symbols.S:10
(gdb) p buffer
$1 = 0
(gdb) p &buffer
$2 = (<data variable, no debug info> *) 0x80490f0 <buffer>
(gdb) ptype buffer
type = <data variable, no debug info>

You can work around the lack of type info by casting it, or using the x command:

(gdb) p (char[12])buffer
$4 = '\000' <repeats 11 times>
(gdb) p /x (char[12])buffer
$5 = {0x0 <repeats 12 times>}
(gdb) x /4w &buffer             # eXamine the memory as 4 "words" (32-bit).  
0x80490f0 <buffer>:     0x00000000      0x00000000      0x00000000      0x00000000
(gdb) help x   # read this to learn about options for dumping memory

For debugging asm, I have this in my ~/.gdbinit:

set disassembly-flavor intel
layout reg
set print static-members off

But since you're writing in AT&T syntax, you probably don't want intel-style disassembly. layout asm / layout reg is fantastic, though. See also debugging tips at the end of the tag wiki. The tag wiki is full of links to docs and guides.

Peter Cordes

You should do p &buffer, not p $buffer. The $ is assembler syntax for an immediate operand, while in gdb it is the prefix for convenience variables (and registers). To print the contents, use something like x/12c &buffer or p (char[12])buffer.

PS: debug info works for local symbols too; the symbol doesn't need to be global.

Jester

.lcomm defines a local common symbol. Common symbols only exist in object files, not executables, so they are not visible to ld.

If you want a symbol that is visible to ld, you should make it .global (or .globl, depending on your assembler).

The idea of common symbols is to allow you to have the same symbol defined in multiple compilation units. They go away after linking.

Cody Gray - on strike
  • Thank you. I guess it's a little bit inconvenient without name information. Do you think it's impossible to support it in as-ld-gdb tool chain, or just nobody tried to do it? After all, in C code, functions can use local variables with same name and gdb is able to tell them apart according to context. – Yichen Jun 30 '17 at 09:08
  • It isn't a limitation, it is by design. These are somewhat like `extern` declarations in C. I have no idea why you *want* these to be available post-link. Sounds like [an X-Y problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) to me. What is it you're actually trying to do with these symbols? – Cody Gray - on strike Jun 30 '17 at 09:33
  • If a programmer declares symbols with .lcomm and wants to get their addresses and trace their values in gdb, what should he do? – Yichen Jun 30 '17 at 09:45
  • Those symbols *don't have* values after linking has occurred because those symbols *don't exist* anymore. – Cody Gray - on strike Jun 30 '17 at 10:42
  • @CodyGray: Actually, unless you tell the linker to strip the binary (`ld --strip-debug` or `--strip-all`), symbols that were in the object files are still there in the symbol table of the executable by default. – Peter Cordes Jul 03 '17 at 03:42
  • At first I thought the problem was that `.lcomm` made a truly local label that didn't go in the object file at all. But then I checked, and it worked for me when I assembled + linked with `gcc -m32 -nostdlib`. (See my answer). – Peter Cordes Jul 03 '17 at 04:02