Given a function in LLVM bitcode, how can I identify its local variables? For example, in the following snippet from the GNU coreutils echo utility, I don't know how to find the variable do_v9 within the IR of main.

int main (int argc, char **argv)
{
  bool display_return = true;
  bool posixly_correct = getenv ("POSIXLY_CORRECT");
  ....
  bool do_v9 = false;
}

I noticed that LLVM creates a metadata node for each local variable, called DILocalVariable, while in the IR itself the variable's name is replaced by a numbered value (such as %9) with a type that starts with the letter i (such as i8).

!686 = !DILocalVariable(name: "posixly_correct", scope: !678, file: !10, line: 114, type: !64)
!688 = !DILocalVariable(name: "do_v9", scope: !678, file: !10, line: 122, type: !64)

However, the IR of main contains neither the name do_v9 nor its corresponding metadata node !688; the only metadata reference I can see is the one beside the definition of main. My analysis loops over the instructions in main, but I don't know how to find this local variable during that iteration. I'm using LLVM 6.0.

; Function Attrs: nounwind uwtable
define i32 @main(i32, i8**) #9 !dbg !678 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i8**, align 8
  %6 = alloca i8, align 1
  %7 = alloca i8, align 1
  %8 = alloca i8, align 1
  %9 = alloca i8, align 1
  %10 = alloca i32
  %11 = alloca i8*, align 8
  %12 = alloca i64, align 8
  %13 = alloca i8*, align 8
  %14 = alloca i8, align 1
  %15 = alloca i8, align 1
Mohannad
  • The metadata you refer to is part of the dwarf debugging information. Did you compile with debug symbols enabled? – Dwight Guth May 25 '20 at 04:50
  • I used the instructions on this website to compile coreutils: https://klee.github.io/tutorials/testing-coreutils/. I think the command it uses, `obj-llvm$ CC=wllvm ../configure --disable-nls CFLAGS="-g -O1 -Xclang -disable-llvm-passes -D__NO_STRING_INLINES -D_FORTIFY_SOURCE=0 -U__OPTIMIZE__"`, passes `-g` to enable debugging information. – Mohannad May 25 '20 at 13:05
  • Does your IR contain any calls to the llvm.dbg.declare or llvm.dbg.addr intrinsics then? – Dwight Guth May 25 '20 at 18:45
  • The IR only contains calls to `llvm.dbg.declare`. BTW, I recompiled coreutils without the `-g` flag and still get the same debugging info. – Mohannad May 25 '20 at 19:13

1 Answer

If you want to identify a local variable from your source code in LLVM IR using the debug information emitted by the compiler, look at the calls to the @llvm.dbg.declare or @llvm.dbg.addr intrinsics in the IR of your function. You should see one or the other (llvm.dbg.addr was introduced as an intended replacement for llvm.dbg.declare) once for each local variable in your function. For example, if you have the following:

%1 = alloca i32, align 4
call void @llvm.dbg.addr(metadata i32* %1, metadata !2, metadata ...), !dbg ...
!2 = !DILocalVariable(name: "i", ...)

This tells us that local variable i corresponds to the stack location allocated by the alloca whose address is %1.

Note that the ... above just represents stuff we don't care about in this context.
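To make the pairing concrete, here is a minimal sketch in Python that works on the textual IR (a .ll file, e.g. produced by llvm-dis) rather than on the C++ API. It looks for llvm.dbg.declare / llvm.dbg.addr calls inside one function and joins them with the module-level DILocalVariable nodes. The helper names (`function_body`, `map_locals`), the regular expressions, and the sample IR are my own assumptions for illustration; real IR has more syntactic variety than these patterns cover.

```python
import re

# Matches e.g.: call void @llvm.dbg.declare(metadata i8* %9, metadata !688, ...
# Captures the SSA value (%9) and the metadata id (!688).
DECLARE_RE = re.compile(
    r'call void @llvm\.dbg\.(?:declare|addr)\('
    r'metadata\s+.*?(%[\w.]+)\s*,\s*metadata\s+(!\d+)'
)

# Matches e.g.: !688 = !DILocalVariable(name: "do_v9", ...
LOCAL_VAR_RE = re.compile(
    r'^(!\d+) = !DILocalVariable\(name: "([^"]+)"', re.MULTILINE
)


def function_body(ir_text, func_name):
    """Return the text of `define ... @func_name(...) { ... }`, or ''."""
    pattern = r'^define\b[^\n]*@' + re.escape(func_name) + r'\(.*?^\}'
    m = re.search(pattern, ir_text, re.MULTILINE | re.DOTALL)
    return m.group(0) if m else ""


def map_locals(ir_text, func_name):
    """Map source-level variable names to the IR values that hold them."""
    # Metadata nodes live at module level, so scan the whole file for them.
    names = {m.group(1): m.group(2) for m in LOCAL_VAR_RE.finditer(ir_text)}
    result = {}
    # The intrinsic calls live inside the function body.
    for m in DECLARE_RE.finditer(function_body(ir_text, func_name)):
        value, md_id = m.groups()
        if md_id in names:
            result[names[md_id]] = value
    return result


# Hypothetical sample IR, modeled on the example above (not taken from echo).
SAMPLE_IR = '''
define i32 @f() !dbg !1 {
  %1 = alloca i32, align 4
  call void @llvm.dbg.addr(metadata i32* %1, metadata !2, metadata !DIExpression()), !dbg !3
  ret i32 0
}

!2 = !DILocalVariable(name: "i", scope: !1, file: !4, line: 3, type: !5)
'''

if __name__ == "__main__":
    print(map_locals(SAMPLE_IR, "f"))
```

Run on your own file, `map_locals(open("echo.ll").read(), "main")` (with `echo.ll` being a placeholder for wherever your disassembled bitcode lives) should give a dictionary from names such as do_v9 to the matching alloca. Function parameters are also described by DILocalVariable nodes, so they may show up in the result as well. If you are iterating over instructions with the C++ API instead, the same information is available by checking each instruction for `DbgDeclareInst` and reading its address and variable.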

Dwight Guth
  • Thanks, Dwight, for your help. I found these two posts that also provide hints for obtaining local variables: https://stackoverflow.com/questions/59479206/llvm-retrieve-name-of-allocainst https://stackoverflow.com/questions/21410675/getting-the-original-variable-name-for-an-llvm-value But I still couldn't figure out some aspects. – Mohannad May 25 '20 at 20:48