54

Update: XE2 Update 2 fixes the bug described below.

The program below, cutdown from the real program, fails with an exception in XE2. This is a regression from 2010. I don't have XE to test on but I'd expect that the program works fine on XE (thanks to Primož for confirming that the code runs fine on XE).

program COMbug;

{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, Windows, Excel2000;

var
  Excel: TExcelApplication;
  Book: ExcelWorkbook;
  Sheet: ExcelWorksheet;
  UsedRange: ExcelRange;
  Row, Col: Integer;
  v: Variant;

begin
  Excel := TExcelApplication.Create(nil);
  try
    Excel.Visible[LOCALE_USER_DEFAULT] := True;
    Book := Excel.Workbooks.Add(EmptyParam, LOCALE_USER_DEFAULT) as ExcelWorkbook;
    Sheet := Book.Worksheets.Add(EmptyParam, EmptyParam, 1, EmptyParam, LOCALE_USER_DEFAULT) as ExcelWorksheet;

    Sheet.Cells.Item[1,1].Value := 1.0;
    Sheet.Cells.Item[2,2].Value := 1.0;
    UsedRange := Sheet.UsedRange[LOCALE_USER_DEFAULT] as ExcelRange;
    for Row := 1 to UsedRange.Rows.Count do begin
      for Col := 1 to UsedRange.Columns.Count do begin
        v := UsedRange.Item[Row, Col].Value;
      end;
    end;
  finally
    Excel.Free;
  end;
end.

In XE2 32 bit the error is:

Project COMbug.exe raised exception class $C000001D with message 'system exception (code 0xc000001d) at 0x00dd6f3e'.

The error occurs on the second execution of UsedRange.Columns.

In XE2 64 bit the error is:

Project COMbug.exe raised exception class $C0000005 with message 'c0000005 ACCESS_VIOLATION'

Again, I think that the error occurs on the second execution of UsedRange.Columns, but the 64 bit debugger steps through the code in a slightly weird way so I'm not 100% sure of that.

I have submitted a QC report for the issue.

I looks very much to me as though something in the Delphi COM/automation/interface stack is comprehensively broken. This is a complete show-stopper for my XE2 adoption.

Does anyone have any experience of this problem? Does anyone have any tips and advice as to how I might attempt to work around the problem? Debugging what's really going on here is outside my area of expertise.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490

2 Answers2

82

Workaround

rowCnt := UsedRange.Rows.Count;
colCnt := UsedRange.Columns.Count;
for Row := 1 to rowCnt do begin
  for Col := 1 to colCnt do begin
    v := UsedRange.Item[Row, Col].Value;
  end;
end;

This also works (and may help you find a workaround in more complicated use cases):

function ColCount(const range: ExcelRange): integer;
begin
  Result := range.Columns.Count;
end;

for Row := 1 to UsedRange.Rows.Count do begin
  for Col := 1 to ColCount(UsedRange) do begin
    v := UsedRange.Item[Row, Col].Value;
  end;
end;

Analysis

It crashes in System.Win.ComObj in DispCallByID when executing _Release in

varDispatch, varUnknown:
  begin
    if PPointer(Result)^ <> nil then
      IDispatch(Result)._Release;
    PPointer(Result)^ := Res.VDispatch;
  end;

Although the PUREPASCAL version of this same procedure in Delphi XE (XE uses an assembler version) is different ...

varDispatch, varUnknown:
  begin
    if PPointer(Result)^ <> nil then
      IDispatch(Result.VDispatch)._Release;
    PPointer(Result)^ := Res.VDispatch;
  end;

... the assembler code in both cases is the same (EDIT: not true, see my notes at the end):

@ResDispatch:
@ResUnknown:
        MOV     EAX,[EBX]
        TEST    EAX,EAX
        JE      @@2
        PUSH    EAX
        MOV     EAX,[EAX]
        CALL    [EAX].Pointer[8]
@@2:    MOV     EAX,[ESP+8]
        MOV     [EBX],EAX
        JMP     @ResDone

Interestingly enough, this crashes ...

for Row := 1 to UsedRange.Rows.Count do begin
  for Col := 1 to UsedRange.Columns.Count do begin
  end;
end;

... and this doesn't.

row := UsedRange.Rows.Count;
col := UsedRange.Columns.Count;
col := UsedRange.Columns.Count;

The reason for this is the use of hidden local variables. In the first example, the code compiles to ...

00564511 6874465600       push $00564674
00564516 6884465600       push $00564684
0056451B A12CF35600       mov eax,[$0056f32c]
00564520 50               push eax
00564521 8D8508FFFFFF     lea eax,[ebp-$000000f8]
00564527 50               push eax
00564528 E8933EEAFF       call DispCallByIDProc

... and that is called twice.

In the second example, two different temporary locations on the stack are used (ebp - ???? offsets):

00564466 6874465600       push $00564674
0056446B 6884465600       push $00564684
00564470 A12CF35600       mov eax,[$0056f32c]
00564475 50               push eax
00564476 8D8514FFFFFF     lea eax,[ebp-$000000ec]
0056447C 50               push eax
0056447D E83E3FEAFF       call DispCallByIDProc
...
0056449B 6874465600       push $00564674
005644A0 6884465600       push $00564684
005644A5 A12CF35600       mov eax,[$0056f32c]
005644AA 50               push eax
005644AB 8D8510FFFFFF     lea eax,[ebp-$000000f0]
005644B1 50               push eax
005644B2 E8093FEAFF       call DispCallByIDProc

The bug occurs when an internal interface stored in this temporary location is being cleared, which happens only when the "for" case is executed for the second time because there's something already stored in this interface - it was put there when "for" was called for the first time. In the second example, two locations are used so this internal interface is always initialized to 0 and Release is not called at all.

The true bug is that this internal interface contains garbage and when Release is called, sh!t happens.

After some more digging, I noticed that the assembler code that frees the old interface is not the same - XE2 version is missing one "mov eax, [eax]" instruction. IOW,

IDispatch(Result)._Release;

is a mistake and it really should be

IDispatch(Result.VDispatch)._Release;

Nasty RTL bug.

gabr
  • 26,580
  • 9
  • 75
  • 141
  • 2
    This is 100% a compiler regression and needs to be in Quality Central. – Warren P Oct 25 '11 at 12:18
  • @mad hatter I don't think it's as bad as you say. FireMonkey is a public alpha but the rest of the product is pretty robust. – David Heffernan Oct 25 '11 at 13:02
  • 11
    The compiler is a pretty amazing piece of work. It has bugs. It always will. But it's still the most incredible language, and compiler technology available. (* NO I don't work for embarcadero. *) – Warren P Oct 25 '11 at 13:12
  • 2
    @gabr Thank you so much for you excellent work here. I think I have enough now to work around the bug. Let's hope that Emba fix this in the next XE2 update. – David Heffernan Oct 25 '11 at 15:02
  • 1
    IMHO Embarcadero should handle far better its beta cycle. I guess MS didn't open its beta just for fun or to showcase products. That way I guess they get far more feedbacks about issues than with a closed beta. XE2 adds a lot, but as long as you hit too many bugs the value is low. – Mad Hatter Oct 25 '11 at 16:03
  • @Mad Hatter they would not let me onto their xe2 beta program. Can't imagine why not. – David Heffernan Oct 25 '11 at 19:57
  • @gabr In your opinion, how widespread is the bug? Is it all interfaces or just dispatch interfaces? – David Heffernan Oct 25 '11 at 19:58
  • 4
    On a point of technical accuracy, isn't this an RTL bug rather than a compiler bug? And how on earth did it happen? Why was that code changed and how was it allowed to be changed in a way that introduced such a serious bug? Inquiring minds need to know. – Deltics Oct 25 '11 at 20:53
  • 1
    @David, I believe this is related only to dispatch interfaces. – gabr Oct 25 '11 at 21:38
  • @David - "Wouldn't let you on the XE2 beta program" ? Was your credit card declined ? ;) – Deltics Oct 26 '11 at 00:23
  • 1
    @gabr - yes I saw that was how you characterised it, I was concerned that somehow between your answer and the comments it had transmogrified into a compiler issue. A rose by any other name and all that, but in my dealing with "Quality" Central I have learned that failure to hit the nail squarely and pedantically on the head can result in a report being summarily dismissed. :) – Deltics Oct 26 '11 at 00:25
  • 1
    Embarcadero preferred to use the beta cycle with marketing aims (buy XE, increase our revenues and let us match our sales target for a release you don't need) instead of technical ones (this a new compiler, let's test it as mush as we can to clean out bugs so we can deliver a high quality product), and this is the result. –  Oct 27 '11 at 13:54
  • 5
    @Deltics "Wouldn't let you on the XE2 beta program". Apparently they gave priority to registered XE users and I wasn't. Quite why they didn't want to let me test and debug their product for free is beyond me. I would have found this bug (and many others). Makes no sense to me at all. – David Heffernan Oct 28 '11 at 15:33
1

Most of this is getting above my head but I did wonder if a call to CoInitialize would be required. Searching the web for CoInitialize returned this page:

http://chrisbensen.blogspot.com/2007/06/delphi-tips-and-tricks.html

It almost looks like that page describes the problem from the OP and gabr's analysis as it relates to a call to .Release. Moving the functionality of the code into it's own procedure might help. I don't have XE or XE2 to test with.

Edit: Rats - meant to add this as a comment to the above.

  • Hmm, I did miss that in my simple example, thanks. It doesn't matter though, perhaps because Excel is running out-of-proc. Even when you add it the same behaviour occurs. And of course the original code from which this example was culled has COM initialized correctly. – David Heffernan Oct 26 '11 at 08:54