Firstly, I apologize for the length of this question.

I am the author of IronScheme. Recently I have been working hard on emitting decent debug info, so that I can use the 'native' .NET debugger.

While this has been partly successful, I am running into some teething problems.

The first problem is related to stepping.

Because Scheme is an expression language, everything tends to be wrapped in parentheses, unlike the major .NET languages, which seem to be statement (or line) based.

The original code (Scheme) looks like:

(define (baz x)
  (cond
    [(null? x) 
      x]
    [(pair? x) 
      (car x)]
    [else
      (assertion-violation #f "nooo" x)]))

I have deliberately placed each expression on its own line.

The emitted code, decompiled to C# (via ILSpy), looks like:

public static object ::baz(object x)
{
  if (x == null)
  {
    return x;
  }
  if (x is Cons)
  {
    return Builtins.Car(x);
  }
  return #.ironscheme.exceptions::assertion-violation+(
     RuntimeHelpers.False, "nooo", Builtins.List(x));
}

As you can see, pretty simple.

Note: Keep in mind that if the code were transformed into a conditional expression (?:) in C#, the whole thing would be just one debug step.

Here is IL output with source and line numbers:

  .method public static object  '::baz'(object x) cil managed
  {
    // Code size       56 (0x38)
    .maxstack  6
    .line 15,15 : 1,2 ''
//000014: 
//000015: (define (baz x)
    IL_0000:  nop
    .line 17,17 : 6,15 ''
//000016:   (cond
//000017:     [(null? x) 
    IL_0001:  ldarg.0
    IL_0002:  brtrue     IL_0009

    .line 18,18 : 7,8 ''
//000018:       x]
    IL_0007:  ldarg.0
    IL_0008:  ret

    .line 19,19 : 6,15 ''
//000019:     [(pair? x) 
    .line 19,19 : 6,15 ''
    IL_0009:  ldarg.0
    IL_000a:  isinst [IronScheme]IronScheme.Runtime.Cons
    IL_000f:  ldnull
    IL_0010:  cgt.un
    IL_0012:  brfalse    IL_0020

    IL_0017:  ldarg.0
    .line 20,20 : 7,14 ''
//000020:       (car x)]
    IL_0018:  tail.
    IL_001a:  call object [IronScheme]IronScheme.Runtime.Builtins::Car(object)
    IL_001f:  ret

    IL_0020:  ldsfld object 
         [Microsoft.Scripting]Microsoft.Scripting.RuntimeHelpers::False
    IL_0025:  ldstr      "nooo"
    IL_002a:  ldarg.0
    IL_002b:  call object [IronScheme]IronScheme.Runtime.Builtins::List(object)
    .line 22,22 : 7,40 ''
//000021:     [else
//000022:       (assertion-violation #f "nooo" x)]))
    IL_0030:  tail.
    IL_0032:  call object [ironscheme.boot]#::
       'ironscheme.exceptions::assertion-violation+'(object,object,object)
    IL_0037:  ret
  } // end of method 'eval-core(033)'::'::baz'

Note: To prevent the debugger from simply highlighting the entire method, I make the method entry point just 1 column wide.

As you can see, each expression maps correctly to a line.

Now the problem with stepping (tested on VS2010, but same/similar issue on VS2008):

These are with IgnoreSymbolStoreSequencePoints not applied.

  1. Call baz with null arg, it works correctly. (null? x) followed by x.
  2. Call baz with Cons arg, it works correctly. (null? x) then (pair? x) then (car x).
  3. Call baz with other arg, it fails. (null? x) then (pair? x) then (car x) then (assertion-violation ...).

When applying IgnoreSymbolStoreSequencePoints (as recommended):

  1. Call baz with null arg, it works correctly. (null? x) followed by x.
  2. Call baz with Cons arg, it fails. (null? x) then (pair? x).
  3. Call baz with other arg, it fails. (null? x) then (pair? x) then (car x) then (assertion-violation ...).

I also find that in this mode some lines (not shown here) are incorrectly highlighted; they are off by 1.

Here are some ideas about what the causes could be:

  • Tail calls confuse the debugger
  • Overlapping locations (not shown here) confuse the debugger (they do so very well when setting a breakpoint)
  • ????

The second, but also serious, issue is the debugger failing to break/hit breakpoints in some cases.

The only place where I can get the debugger to break correctly (and consistently) is at the method entry point.

The situation gets a bit better when IgnoreSymbolStoreSequencePoints is not applied.

Conclusion

It might be that the VS debugger is just plain buggy :(

References:

  1. Making a CLR/.NET Language Debuggable

Update 1:

Mdbg does not work for 64-bit assemblies. So that is out. I have no more 32-bit machines to test it on. Update: I am sure this is no big problem, does anyone have a fix? Edit: Yes, silly me, just start mdbg under the x64 command prompt :)

Update 2:

I have created a C# app, and tried to dissect the line info.

My findings:

  • After any brXXX instruction you need to have a sequence point (if not valid aka '#line hidden', emit a nop).
  • Before any brXXX instruction, emit a '#line hidden' and a nop.

Applying this does not, however, fix the issues (at least not alone).

But adding the following gives the desired result :)

  • After ret, emit a '#line hidden' and a nop.

This is using the mode where IgnoreSymbolStoreSequencePoints is not applied. When applied, some steps are still skipped :(

Here is the IL output when above has been applied:

  .method public static object  '::baz'(object x) cil managed
  {
    // Code size       63 (0x3f)
    .maxstack  6
    .line 15,15 : 1,2 ''
    IL_0000:  nop
    .line 17,17 : 6,15 ''
    IL_0001:  ldarg.0
    .line 16707566,16707566 : 0,0 ''
    IL_0002:  nop
    IL_0003:  brtrue     IL_000c

    .line 16707566,16707566 : 0,0 ''
    IL_0008:  nop
    .line 18,18 : 7,8 ''
    IL_0009:  ldarg.0
    IL_000a:  ret

    .line 16707566,16707566 : 0,0 ''
    IL_000b:  nop
    .line 19,19 : 6,15 ''
    .line 19,19 : 6,15 ''
    IL_000c:  ldarg.0
    IL_000d:  isinst     [IronScheme]IronScheme.Runtime.Cons
    IL_0012:  ldnull
    IL_0013:  cgt.un
    .line 16707566,16707566 : 0,0 ''
    IL_0015:  nop
    IL_0016:  brfalse    IL_0026

    .line 16707566,16707566 : 0,0 ''
    IL_001b:  nop
    IL_001c:  ldarg.0
    .line 20,20 : 7,14 ''
    IL_001d:  tail.
    IL_001f:  call object [IronScheme]IronScheme.Runtime.Builtins::Car(object)
    IL_0024:  ret

    .line 16707566,16707566 : 0,0 ''
    IL_0025:  nop
    IL_0026:  ldsfld object 
      [Microsoft.Scripting]Microsoft.Scripting.RuntimeHelpers::False
    IL_002b:  ldstr      "nooo"
    IL_0030:  ldarg.0
    IL_0031:  call object [IronScheme]IronScheme.Runtime.Builtins::List(object)
    .line 22,22 : 7,40 ''
    IL_0036:  tail.
    IL_0038:  call object [ironscheme.boot]#::
      'ironscheme.exceptions::assertion-violation+'(object,object,object)
    IL_003d:  ret

    .line 16707566,16707566 : 0,0 ''
    IL_003e:  nop
  } // end of method 'eval-core(033)'::'::baz'
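The padding rules from Update 2 can be sketched as a small pass over an instruction list. This is a simplified Python model, not IronScheme's actual emitter: the `(line, opcode)` tuple layout, the `pad_sequence_points` name, and the short set of branch opcodes are assumptions made for illustration. The value 16707566 in the listing is 0xFEEFEE, the line number used for hidden sequence points.

```python
HIDDEN = 0xFEEFEE  # 16707566, the hidden-line marker seen in the listing

# Assumption: only the brXXX forms mentioned in the question are handled.
BRANCHES = {"br", "brtrue", "brfalse", "br.s", "brtrue.s", "brfalse.s"}


def pad_sequence_points(instrs):
    """Return a new list of (line, opcode) pairs with hidden nops added.

    line is a source line number, HIDDEN, or None when the instruction
    carries no sequence point of its own.
    """
    out = []
    for line, op in instrs:
        if op in BRANCHES:
            out.append((HIDDEN, "nop"))  # hidden point + nop before the branch
            out.append((line, op))
            out.append((HIDDEN, "nop"))  # hidden point + nop after the branch
        elif op == "ret":
            out.append((line, op))
            out.append((HIDDEN, "nop"))  # hidden point + nop after ret
        else:
            out.append((line, op))
    return out


# First few instructions of the original '::baz' listing.
original = [
    (15, "nop"),
    (17, "ldarg.0"),
    (None, "brtrue"),
    (18, "ldarg.0"),
    (None, "ret"),
]

for line, op in pad_sequence_points(original):
    print(line, op)
```

Run on the first five instructions of the original listing, the model produces the same instruction order as IL_0000 to IL_000b in the padded listing. It says nothing about verifiability, which is where the nop after ret causes trouble.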

Update 3:

Problem with the above 'semi-fix': peverify reports errors on all methods due to the nop after ret. I don't really understand the problem. How can a nop after a ret break verification? It is like dead code (except that it is NOT even code) ... Oh well, experimentation continues.

Update 4:

Back at home now, removed the 'unverifiable' code, running on VS2008 and things are a lot worse. Perhaps running unverifiable code for the sake of proper debugging might be the answer. In 'release' mode, all output would still be verifiable.

Update 5:

I have now decided my above idea is the only viable option for now. Although the generated code is unverifiable, I have yet to encounter a VerificationException. I don't know what the impact on the end user will be in this scenario.

As a bonus, my second issue has also been solved. :)

Here is a little screencast of what I ended up with. It hits breakpoints, does proper stepping (in/out/over), etc. All in all, the desired effect.

I am, however, still not accepting this as the way to do it. It feels overly hacky to me. Having a confirmation of the real issue would be nice.

Update 6:

Just had the chance to test the code on VS2010, and there seem to be some problems:

  1. The first call now does not step correctly. (assertion-violation ...) is hit. Other cases work fine. Some old code emitted unnecessary positions. Removed the code, works as expected. :)
  2. More seriously, breakpoints fail on the second invocation of the program (using in-memory compilation; dumping the assembly to a file seems to make breakpoints happy again).

Both of these cases work correctly under VS2008. The main difference is that under VS2010 the entire application is compiled for .NET 4, whereas under VS2008 it is compiled for .NET 2. Both are running 64-bit.

Update 7:

As mentioned, I got mdbg running under 64-bit. Unfortunately, it also has the breakpoint issue, where it fails to break if I rerun the program (this implies it gets recompiled, so it is not using the same assembly, but still using the same source).

Update 8:

I have filed a bug at the MS Connect site regarding the breakpoint issue.

Update: Fixed

Update 9:

After some long thinking, the only way to make the debugger happy seems to be doing SSA, so every step can be isolated and sequential. I am yet to prove this notion though. But it seems logical. Obviously, cleaning up temps from SSA will break debugging, but that is easy to toggle, and leaving them does not have much overhead.

leppie
  • Any ideas welcome, I will try them out and append the results to the question. First 2 ideas, try mdbg and write the same code in C# and compare. – leppie Aug 04 '11 at 06:44
  • What's the question? – George Stocker Aug 05 '11 at 14:10
  • I think his question is along the lines of "why isn't my language stepping correctly?" – McKay Aug 05 '11 at 14:28
  • If you don't pass peverify you won't be able to run in partial trust situations, like from a mapped drive or in many plugin hosting setups. How are you generating your IL, Reflection.Emit or via text + ilasm? In either case you can always put in the .line 16707566,16707566 : 0,0 '' yourself; that way you don't have to worry about them putting a nop after the return. – Hippiehunter Aug 05 '11 at 17:24
  • @Hippiehunter: Thanks. Unfortunately without the `nop`s, the stepping fails (I will verify this again to be sure). It is a sacrifice I guess I would have to make. It is not like VS can even run without administrator rights :) Btw using Reflection.Emit via the DLR (a very hacked early branched one). – leppie Aug 05 '11 at 17:52
  • Are you using the default expression evaluator? Also have you tried applying the line directive to the ret instead of a nop? – Hippiehunter Aug 05 '11 at 19:00
  • Also according to this http://blogs.msdn.com/b/clrcodegeneration/archive/2009/05/11/tail-call-improvements-in-net-framework-4.aspx It looks like you should be able to put a nop after the tail call before the ret, thus giving you somewhere to put the hidden line directive – Hippiehunter Aug 05 '11 at 19:57
  • FYI: http://programmers.stackexchange.com/questions/100417/how-does-the-dlr-compare-to-c-4-0 – Robert Harvey Aug 11 '11 at 16:53
  • I don't see any mentions of pdb's... I thought the pdb was what told the debugger how to map the IL onto the original code? – justin.m.chase Aug 11 '11 at 17:15
  • @justin.m.chase: PDB's are generated and I currently have to rely on its sequence points (see `IgnoreSymbolStoreSequencePoints` which I disable). If the debugger uses the IL sequence points (iow applying `IgnoreSymbolStoreSequencePoints`), it goes off by 1. – leppie Aug 11 '11 at 17:36
  • @Hippiehunter: I will try that, but my code will need some modification, and given the strictness of .NET 4, it might not be verifiable. Still not sure it will solve the stepping issue. Currently .NET 4 (and 2) seems to be at least happy with having unverifiable code in unreachable locations. Thanks for your feedback. – leppie Aug 11 '11 at 17:41
  • Here's another mention of IronScheme: http://programmers.stackexchange.com/questions/107368 – Robert Harvey Sep 11 '11 at 04:11
  • @Robert Harvey: Thanks for the ping :) – leppie Sep 11 '11 at 06:43

2 Answers

I am an engineer on the Visual Studio Debugger team.

Correct me if I am wrong, but it sounds like the only issue left is that when switching from PDBs to the .NET 4 dynamic compile symbol format some breakpoints are being missed.

We would probably need a repro to exactly diagnose the issue, however here are some notes that might help.

  1. VS (2008+) can run as a non-admin
  2. Do any symbols load at all the second time around? You might test by breaking in (through an exception or a call to System.Diagnostics.Debugger.Break())
  3. Assuming that symbols load, is there a repro that you could send us?
  4. The likely difference is that the symbol format for dynamic-compiled code is 100% different between .NET 2 (PDB stream) and .NET 4 (IL DB I think they called it?)
  5. The 'nop's sound about right. See rules for generating implicit sequence points below.
  6. You don't actually need to emit things on different lines. By default, VS will step 'symbol-statements' where, as the compiler writer, you get to define what 'symbol-statement' means. So if you want each expression to be a separate thing in the symbol file, that will work just fine.

The JIT creates an implicit sequence point based on the following rules:

  1. IL nop instructions
  2. IL stack empty points
  3. The IL instruction immediately following a call instruction
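The three rules above can be modeled roughly as follows. This Python sketch is an illustration under stated assumptions, not the JIT's algorithm: it tracks stack depth linearly (no branches), folds the `tail.` prefix into the call, reads "stack empty point" as "the stack is empty just before the instruction", and uses a hypothetical `implicit_sequence_points` helper. The fragment is the `(car x)` part of the original IL in the question.

```python
def implicit_sequence_points(instrs):
    """Return the offsets that qualify as implicit sequence points.

    instrs is a list of (offset, opcode, pops, pushes) for one
    straight-line run of IL that starts with an empty stack.
    """
    points = []
    depth = 0               # evaluation stack depth before the instruction
    prev_was_call = False
    for offset, opcode, pops, pushes in instrs:
        if opcode == "nop" or depth == 0 or prev_was_call:
            points.append(offset)
        depth += pushes - pops
        prev_was_call = opcode in ("call", "callvirt")
    return points


# IL_0017 .. IL_001f from the original listing.
fragment = [
    (0x17, "ldarg.0", 0, 1),
    (0x18, "call", 1, 1),    # tail. call Builtins::Car (prefix folded in)
    (0x1F, "ret", 1, 0),
]

print([hex(o) for o in implicit_sequence_points(fragment)])
```

Under this model, IL_0018 (where `.line 20,20 : 7,14` is attached in the original output) does not qualify, because `x` is already on the stack at that point; IL_0017 does.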

If it turns out we do need a repro to solve your issue, you can file a connect bug and upload files securely through that medium.

Update:

We are encouraging other users experiencing this issue to try the Developer Preview of Dev11 from http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=27543 and comment with any feedback. (Must target 4.5)

Update 2:

Leppie has verified the fix to work for him on the Beta version of Dev11 available at http://www.microsoft.com/visualstudio/11/en-us/downloads as noted in the connect bug https://connect.microsoft.com/VisualStudio/feedback/details/684089/.

Thanks,

Luke

Luke Kim
  • Thanks a lot. I will try to make a repro, if needed. – leppie Aug 13 '11 at 05:58
  • As for the confirmation, yes you are correct, but I would still prefer to solve the stepping issues without breaking verifiability. But as said, I am willing to sacrifice that if I could possibly prove that it will never throw a runtime `VerificationException`. – leppie Aug 13 '11 at 06:26
  • As for point 6, I simply made it 'line-based' for this question. The debugger does in fact step correctly in/out/over expressions like I intend it to work :) The only problem with this approach is setting breakpoints on the 'inner' expressions if an 'outer' expression covers them; the debugger in VS tends to want to make the outermost expression the breakpoint. – leppie Aug 13 '11 at 06:34
  • I think I have isolated the breakpoint issue. While running under CLR2 the breakpoints seem to be re-evaluated when running the code again, albeit new code. On CLR4, the breakpoints only 'stick' to the original code. I have made a small screencast of the CLR4 behavior @ http://screencast.com/t/eiSilNzL5Nr. Notice that the type name changes between invocations. – leppie Aug 13 '11 at 09:35
  • I have filed a bug on connect. The repro is quite easy :) https://connect.microsoft.com/VisualStudio/feedback/details/684089/ – leppie Aug 14 '11 at 13:21
  • Marking as answer in the meanwhile. – leppie Aug 15 '11 at 10:09
  • Thanks for filing the bug. We'll take a look at it internally, however it might be a few days. – Luke Kim Aug 16 '11 at 01:03
  • No problem, I suspect an issue in the debugger 'section' as MDBG showed the same behavior. – leppie Aug 16 '11 at 04:29
  • Just to close this thread so you know I didn't disappear... The Connect ticket will be updated with future findings. We have found the actual issue, but it's not from my team, so there could be some delay with the Connect bug being updated. – Luke Kim Aug 18 '11 at 05:07
  • If you can, I'd now recommend you try the developer preview build http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=27543. Thanks. – Luke Kim Sep 27 '11 at 00:37
  • Thanks, I did, but the new 'remote debugger' (using pseudo remote) is too slow to even start my app without debug symbols. This is targeting .NET 4.5. :( I'll wait for the RC :) (Can you change the default back to the 'old debugger'?) – leppie Sep 27 '11 at 04:23
  • Interesting. Unlike this .NET issue, the Remote Debugger is something I actually work on. I am interested as to why it is slow for you, especially considering this is pseudo remote. I'll try to follow up through your CodePlex contact tomorrow about the performance issue. In the meantime, perhaps you could try x86? – Luke Kim Sep 27 '11 at 04:37
  • IIRC, the hosting debugger process, was taking up all the CPU resources, leaving nothing for the app. I will create a repro tonite. Thanks :) – leppie Sep 27 '11 at 06:51

I am an engineer on the SharpDevelop Debugger team :-)

Did you solve the problem?

Did you try to debug it in SharpDevelop? If there is a bug in .NET, I wonder if we need to implement some workaround. I am not aware of this issue.

Did you try to debug it in ILSpy? Especially without debug symbols. It would debug C# code, but it would tell us if the IL instructions are nicely debuggable. (Mind that the ILSpy debugger is beta, though.)

Quick notes on the original IL code:

  • .line 19,19 : 6,15 '' occurs twice?
  • .line 20,20 : 7,14 '' does not start on implicit sequence point (stack is not empty). I am worried
  • .line 20,20 : 7,14 '' includes the code for "car x" (good) as well as "#f nooo x" (bad?)
  • regarding the nop after ret. What about stloc, ldloc, ret? I think C# uses this trick to make ret a distinct sequence point.

David

dsrbecky
  • +1 Thanks for the feedback. As for the first point, it is an unfortunate side effect of the code generator, but seems harmless. Second point, this is exactly what I need, but I see your point, will look into it. Not sure what you mean by the last point. – leppie Sep 12 '11 at 04:13
  • Third point was that .line 22,22 : 7,40 '' should be before IL_0020. Or there should be something before IL_0020, otherwise the code still counts as .line 20,20 : 7,14 ''. The fourth point is "ret, nop" might be replaced with "stloc, .line, ldloc, ret". I have seen the pattern before, maybe it was redundant, but maybe it had a reason. – dsrbecky Sep 13 '11 at 21:34
  • I can't do the stloc, ldloc before ret, as I will lose the tail call. Ah I get you, you were referring to the original output? – leppie Sep 14 '11 at 06:49
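The point that code from IL_0020 onwards "still counts as" line 20 can be checked with a small Python model of the sequence point table. It is a deliberate simplification: each IL offset is assumed to belong to the nearest preceding sequence point, and the helper names (`line_for_offset`, `lines_visited`) are hypothetical. It does not model how the VS debugger actually implements stepping.

```python
import bisect

# Sequence points of the original '::baz' listing: (IL offset, source line).
ORIGINAL_POINTS = [
    (0x00, 15), (0x01, 17), (0x07, 18),
    (0x09, 19), (0x18, 20), (0x30, 22),
]

# Offsets executed when baz is called with a non-null, non-pair argument.
OTHER_ARG_PATH = [
    0x00, 0x01, 0x02, 0x09, 0x0A, 0x0F, 0x10, 0x12,
    0x20, 0x25, 0x2A, 0x2B, 0x30, 0x32, 0x37,
]


def line_for_offset(points, offset):
    """Map an IL offset to the line of the nearest preceding sequence point."""
    offsets = [o for o, _ in points]
    i = bisect.bisect_right(offsets, offset) - 1
    return points[i][1] if i >= 0 else None


def lines_visited(points, path):
    """Collapse an execution path into the sequence of distinct lines seen."""
    lines = []
    for offset in path:
        line = line_for_offset(points, offset)
        if line is not None and (not lines or lines[-1] != line):
            lines.append(line)
    return lines


# Suggested change: start line 22 at IL_0020 instead of only at IL_0030.
FIXED_POINTS = sorted(ORIGINAL_POINTS + [(0x20, 22)])

print(lines_visited(ORIGINAL_POINTS, OTHER_ARG_PATH))
print(lines_visited(FIXED_POINTS, OTHER_ARG_PATH))
```

With the original table, the path passes through line 20, `(car x)`, before reaching line 22, which matches the stray step reported in the question. With a sequence point for line 22 at IL_0020, line 20 no longer appears in this model.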