
What I am trying to achieve is to avoid constant-folding of some constants (which represent addresses in my code) such as the 100000000 constant below. I need this because later the JIT-compiled code might get patched, which changes the constants due to object relocation.

The code below is my best effort to avoid constant-folding (at all costs). It doesn't work: I end up with the constant 100011111 in the instruction stream.

`llc -O0 code.ll -print-after-all` reveals that the folding happens in the Expand ISel Pseudo-instructions pass.

; ModuleID = '0'
target triple = "x86_64-unknown-linux-gnu"

define  i64 @"0"() {
BlockEntry0:
  %cell = alloca i64, align 8
  store volatile i64 0, i64* %cell, align 8
  %volatile_zero3 = load volatile i64, i64* %cell, align 8
  %base = add i64 %volatile_zero3, 100000000
  %volatile_zero4 = load volatile i64, i64* %cell, align 8
  %opaque_offset = add i64 %volatile_zero4, 11111
  %casted_base = inttoptr i64 %base to i8*
  %gep = getelementptr i8, i8* %casted_base, i64 %opaque_offset
  %as_ptr = bitcast i8* %gep to i64*
  %loaded = load i64, i64* %as_ptr, align 4
  %as_function = inttoptr i64 %loaded to i64 (i64)*
  %ret_val = tail call i64 %as_function(i64 0)
  ret i64 %ret_val
}

attributes #0 = { nounwind }

I realize that my problem could be solved by adding an intrinsic that, at the codegen level, expands to a simple `movabs reg, imm64`. But I'd like to have a temporary solution for the time being.

The question: is it possible to make an opaque constant in LLVM that doesn't get constant-folded?

My LLVM version is 3.7.0svn.

Vladislav Ivanishin
  • If the value's going to change, does it really make sense to use a constant? Maybe use a global variable with a constant initializer instead? – Ismail Badawi Aug 03 '15 at 19:41
  • @IsmailBadawi, good thinking. That would solve it, but performance is a concern here (as a matter of fact, it is the ultimate goal). Though I am not sure those extra loads are going to make a real difference if we take LICM into account. I'll probably try the globals/constant pool if there's no simpler way. – Vladislav Ivanishin Aug 03 '15 at 20:51
  • Patching the code seems error prone to me. How do you find the 1000000, and how do you know that some 1000000 is something you really want to patch vs. some artifact of code generation completely independent of your constant and used for other purposes (like a bit mask)? Then again, if you knew somehow exactly what to patch then you could just let the compiler fold constants and patch the constant folded results... – Erik Eidt Aug 03 '15 at 22:46
  • @ErikEidt, I remember the addresses (they are always made 64-bit) I might want to patch -- no problem here. Later, when the native code has been emitted I disassemble it (with MC disassembler), look for instructions having imm64 as an operand (AFAIK there are only 2 in the x64 instruction set) and check whether the imm is one of the constants I track. Your second observation is true: if there's a bit mask equal to one of my addresses I'm doomed. I haven't encountered bit masks this large doing my lowering to LLVM IR yet, but I see how fragile the assumption is. – Vladislav Ivanishin Aug 04 '15 at 06:37
  • @ErikEidt, I think an LLVM intrinsic that simply moves a constant to a register and remembers the offset from the function start (like the stackmap/patchpoint intrinsics do) would be ideal. No problem with bit masks, etc. and no excessive loads. I just don't know how much time it'll take me to implement, so I am looking for a temporary solution. – Vladislav Ivanishin Aug 04 '15 at 06:44

1 Answer

No, it's not possible. Your best bet is to use an external global variable as was mentioned in the comments. In fact, for your purposes it might be exactly what you want to do since at that point your jittable code will get a relocation for what you actually want and patched up accordingly at execution time by rtdyld.
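
A minimal sketch of the external-global approach, written in the typed-pointer IR syntax of the question's LLVM 3.7. The symbol name `@patchable_base` is hypothetical, and the sketch assumes the JIT's symbol resolution supplies its address (the role played by 100000000 in the question). The base then reaches the code as a symbol reference instead of a literal:

```llvm
; Hypothetical symbol: its address stands in for the 100000000 base.
; It is declared but not defined here, so the JIT has to resolve it.
@patchable_base = external global i8

define i64 @"0"() {
BlockEntry0:
  ; The base is symbolic, so no integer literal for it exists in the IR.
  %gep = getelementptr i8, i8* @patchable_base, i64 11111
  %as_ptr = bitcast i8* %gep to i64*
  %loaded = load i64, i64* %as_ptr, align 4
  %as_function = inttoptr i64 %loaded to i64 (i64)*
  %ret_val = tail call i64 %as_function(i64 0)
  ret i64 %ret_val
}
```

The 11111 offset may still be merged into the address computation, but the base stays symbolic. Whether the reference is emitted as a 64-bit immediate with a relocation or as a RIP-relative access likely depends on the code model and relocation model in use, so it is worth checking the emitted object.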

If you want an actual constant for the jitted code (e.g. to call a particular address that you know about) then what you're doing is just fine.

echristo
  • Okay.. :) The problem with global variables is that they also have to be allocated somewhere. If it's the heap, the VM's GC should take control over them. The jitted code can later be discarded, and so should the relevant globals. Though it's manageable I guess. I've discovered the `llvm.experimental.gc.relocate` intrinsic and friends. Will try to go with them with globals as a backup plan. Thanks for your help! – Vladislav Ivanishin Aug 04 '15 at 17:49
  • Right. I was expecting that they were actually just external constants that resided somewhere; otherwise you might be able to pass them in as arguments to your function. If they're truly constants you might be able to have an optimization pass that runs at jitting time and replaces some "named variables" that you have in your program with the correct values. – echristo Aug 05 '15 at 18:22