In Delphi, a function result is frequently implemented as a var-parameter (not an out-parameter, despite the QC ticket).

String constants are basically variables with a negative reference counter, which should suppress automatic memory [de]allocation. http://docwiki.embarcadero.com/RADStudio/XE3/en/Internal_Data_Formats#Long_String_Types

It really does suppress it: the code below does not leak.

type
  TDealRecord = record
    id_Type: Integer;
    Price: extended;
    Remark: String;
  end;
const const_loop = 100000000;

function TestVar: TDealRecord;
//procedure TestVar;
var
  Li: Integer;
  LRec: TDealRecord;
begin
  for Li := 1 to const_loop do begin
     FillChar(Lrec,SizeOf(LRec), 0);
     LRec.Remark := 'Test';

//     FillChar(Result,SizeOf(Result), 0);
//     Result.Remark := 'Test';
  end;
end;

But change the manipulated variable - and it immediately starts to leak heavily.

function TestVar: TDealRecord;
//procedure TestVar;
var
  Li: Integer;
  LRec: TDealRecord;
begin
  for Li := 1 to const_loop do begin
//     FillChar(Lrec,SizeOf(LRec), 0);
//     LRec.Remark := 'Test';

     FillChar(Result,SizeOf(Result), 0);
     Result.Remark := 'Test';
  end;
end;
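
One hedged reading of the leak: `FillChar` overwrites the `Remark` pointer without releasing whatever it referenced. That is harmless while the field points at a literal (negative refcount), but it loses a heap block once the field holds a heap copy. A minimal sketch of a reset that avoids this, assuming the standard `Finalize` intrinsic is called before zeroing; `NoLeakDemo` and `TestVarSafe` are hypothetical names, and the loop count is reduced for illustration:

```pascal
program NoLeakDemo;
{$APPTYPE CONSOLE}

type
  TDealRecord = record
    id_Type: Integer;
    Price: Extended;
    Remark: string;
  end;

const
  const_loop = 1000000; // smaller than in the question, illustration only

// Hypothetical variant of TestVar: releases managed fields before zeroing.
function TestVarSafe: TDealRecord;
var
  Li: Integer;
begin
  for Li := 1 to const_loop do
  begin
    Finalize(Result);                     // drop the string held from the previous pass
    FillChar(Result, SizeOf(Result), 0);  // no live managed reference is overwritten now
    Result.Remark := 'Test';
  end;
end;

var
  R: TDealRecord;
begin
  R := TestVarSafe;
  Writeln(R.Remark);
end.
```

This does not explain why the compiler picks a copying assignment for `Result`; it only shows that the leak comes from zeroing a live string reference rather than from the assignment itself.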

It turns out that string := const is implemented with different calls, depending on the LValue:

  1. Result: AnsiString -> LStrAsg
  2. Result: UnicodeString -> UStrAsg
  3. Local var: UnicodeString -> UStrLAsg
  4. Local var: AnsiString -> LStrLAsg

And while the latter two copy the pointer as expected, the former two copy the string to a new instance, as if I had added a UniqueString call to them.

Why the difference?
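
The difference can be observed without a debugger. A minimal sketch, assuming a Unicode-era Delphi (2009 or later, to my knowledge) where `System.StringRefCount` is available; the program and routine names are hypothetical. If the behaviour described above holds, the local should report a negative count (it still points at the literal) and `Result` a positive one (it holds a heap copy):

```pascal
program StrAsgDemo;
{$APPTYPE CONSOLE}

// Hypothetical demo routines, not taken from the RTL or the code above.

procedure FromLocal;
var
  S: string;
begin
  S := 'Test';                                   // literal assigned to a local
  Writeln('Local:  ', StringRefCount(S));
end;

function FromResult: string;
begin
  Result := 'Test';                              // literal assigned to Result
  Writeln('Result: ', StringRefCount(Result));
end;

begin
  FromLocal;
  FromResult;  // return value deliberately discarded
end.
```

The exact numbers may vary between compiler versions, so treat the output as a probe rather than a guarantee.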

Arioch 'The
  • It's presumably due to the difference between variables with local scope, and variables with scope outside the function in which the assignment is made. When you assign to the `Result` variable, you are assigning to a variable that is visible outside that function. When you assign to the local, it is private to that function. – David Heffernan Oct 11 '12 at 10:45
  • @David, yes, yes, but "so what?"! Why is it prohibited to make the outer variable point to a "-1" constant? When (if) needed, it would be separated on demand. Doing it proactively in the function looks like premature optimization, and it also introduces radically different behavior where "common sense" would suggest no difference. – Arioch 'The Oct 11 '12 at 11:14
  • I can only think that the compiler cannot track what it assigns "from". It assumes that it assigned not a global constant but an internal expression on the stack, which would be lost. Then immediate cloning makes sense. But that seems to be so rough, so "global" a rule that it is rather inefficient. – Arioch 'The Oct 11 '12 at 11:15

2 Answers

In Delphi, constant strings are always copied when assigned to a global variable, but not when assigned to a local variable, to avoid access violations in some borderline cases.

Use the source, Luke!

See this code extract from System.pas:

{ 99.03.11
  This function is used when assigning to global variables.

  Literals are copied to prevent a situation where a dynamically
  allocated DLL or package assigns a literal to a variable and then
  is unloaded -- thereby causing the string memory (in the code
  segment of the DLL) to be removed -- and therefore leaving the
  global variable pointing to invalid memory.
}
procedure _LStrAsg(var dest; const source);
var
  S, D: Pointer;
  P: PStrRec;
  Temp: Longint;
begin
  S := Pointer(source);
  if S <> nil then
  begin
    P := PStrRec(Integer(S) - sizeof(StrRec));
    if P.refCnt < 0 then   // make copy of string literal
    begin
      Temp := P.length;
      S := _NewAnsiString(Temp);
      Move(Pointer(source)^, S^, Temp);
      P := PStrRec(Integer(S) - sizeof(StrRec));
    end;
    InterlockedIncrement(P.refCnt);
  end;
....

So in short, by design, a copy is always made, to avoid access violations when a DLL or package that contained constant values passed back to the main process is unloaded.

You have two pairs of functions:

  • LStrAsg or UStrAsg, which is generated by the compiler when a string has a chance to be a constant - this is the code above;
  • LStrLAsg or UStrLAsg (the added L stands for "local"), which is generated by the compiler when the source string is local, so has no chance to be a constant: in this case, P.refCnt < 0 won't be checked, so it will be faster than the code above.
Arnaud Bouchez
  • Arnaud - look at _UStrLAsg and _LStrLAsg. They are also used to assign a constant to a string, and they DO NOT clone it. While your _LStrAsg MAY be used sometimes, sometimes Delphi uses another function with different behavior. And WHY that split happens for what look like basically the same actions is the heart of the question. – Arioch 'The Oct 11 '12 at 12:01
  • DLL/BPL unloading is a fair point, but irrelevant to the const issue. A BPL-local constant WOULD be unloaded with its BPL. So thanks for the hint, yet your answer is not correct. I actually showed code that (at least in XE2) subverts the *UStrLAsg... when the source string is local, so has no chance to be a constant* idea - it is actually called for a string constant, whether named or literal. – Arioch 'The Oct 11 '12 at 12:06
  • Moreover, imagine a routine like this: `procedure XXX; var s: string; h: THandle; begin h := LoadPackage('PK1.BPL'); s := PK1.Unit1.Const1; UnloadPackage(h); ShowMessage(s); end;` This time the DLL would be unloaded and the string pointer to the const would become a stray one. I don't think Delphi would detect this and protect me by cloning the string automatically. – Arioch 'The Oct 11 '12 at 12:08
  • `P.refCnt < 0 won't be checked` is also actually incorrect. It would always be checked, but if it is negative, `InterlockedIncrement` would be skipped, which would actually add speed in multicore environments. – Arioch 'The Oct 11 '12 at 12:12
  • To sum up: is your quote from the Delphi RTL sources or the FreePascal ones? If Delphi, then which version? It seems like an interesting look into some other Pascal compiler, but not Delphi. – Arioch 'The Oct 11 '12 at 12:12
  • @Arioch'The Your code sample is highly hypothetical but would likely break the current implementation, you are right. `UStrLAsg` is indeed to be used when the source string is local - `L` here stands for *local source*. I do not see any issue here, according to the comment found in my Delphi 7 `System.pas` - the code in XE2/XE3 is still the same, and the comment was not removed. For `_UStrAsg` it states: `// globals (need copy)`. Still the same logic. – Arnaud Bouchez Oct 11 '12 at 12:13
  • Oh, I see. Delphi RTL before the FastCode integration. I should blow the dust off my Delphi 5 box - if _LStrLAsg really did not care about refcounting, which I truly doubt. Consider `UniqueString(global-var); local-var := global-var; global-var := EmptyStr;` If local assignment ignores refcounting, wouldn't it cause premature freeing of that very value here? – Arioch 'The Oct 11 '12 at 12:18
  • @Arioch'The What I wrote is `P.refCnt < 0 won't be checked` in `LStrLAsg` or `UStrLAsg`: please check the source code of System.pas if you find my answer not clear enough. – Arnaud Bouchez Oct 11 '12 at 12:18
  • @ArnaudBouchez Well done. In this case it clearly pays to have an old version of Delphi at hand! ;-) – David Heffernan Oct 11 '12 at 12:18
  • @Arioch'The I wrote that it is still the same in Delphi XE2 / XE3. Comment is still there. FastCode did not change a line of those functions. Our [Enhanced RTL versions](http://blog.synopse.info/post/2010/01/18/Enhanced-System-Run-Time-for-Delphi-7-and-Delphi-2007) did have some change for speed, but not Delphi RTL code, since 1999. USE THE SOURCE, LUKE! :) – Arnaud Bouchez Oct 11 '12 at 12:19
  • Of course I checked! The PUREPASCAL version calls _LStrAddRef, which has `if P.refcnt >= 0 then InterlockedIncrement(P.refcnt);` The x86 asm is similar, skipping the `LOCK INC`. But the comparison itself is always made. – Arioch 'The Oct 11 '12 at 12:20
  • I can't quote XE2 RTL sources in comments, but you can edit your answer. If your XE2 really skips the check, then show it; I'd be rather puzzled. I showed the sample, which I consider valid and which makes the refcount check mandatory even when local variables are involved. – Arioch 'The Oct 11 '12 at 12:22
  • @Arioch'The The code I show is the same in Delphi 7 up to XE3 (only some inlined function __StringRefCnt added). I suspect you are still missing the fact that there are TWO function pairs: `LStrAsg` and `LStrLAsg` - the first will check for refcnt, the second (used for local source variables) won't. Please read my answer till the end. The compiler will emit code call for one or the other, depending on the context. – Arnaud Bouchez Oct 11 '12 at 12:22
  • How can you suspect that if I mentioned them both in my question? Well, you probably mean checking DestVar.RefCnt while I meant SourceVar.RefCnt - perhaps that is the misunderstanding? – Arioch 'The Oct 11 '12 at 12:24
  • *The compiler will emit code call for one or the other, depending on the context* - I also wrote above that your context definition does not match the XE2 behaviour right in front of me. It surely makes a choice, but the criterion has nothing to do with the source, only with the destination. Trust me, I checked that `s := '1234'` does call *StrLAsg, while '1234' is obviously not a local var. And if you change '1234' to a global const, the behaviour would not change a bit! – Arioch 'The Oct 11 '12 at 12:30
  • @Arioch'The Indeed, the choice is made based on the destination, exactly as per the comment that Arnaud includes in the answer. – David Heffernan Oct 11 '12 at 14:57
  • @David which comment? I only see "when the source string is local" – Arioch 'The Oct 11 '12 at 15:02
  • Well, my Delphi has neither the code nor the comment. And if DLL unloading was the main and only reason, then it is doing it poorly. In my examples all the code was in the DPR, so you cannot even claim those function Results to be `global` - though even that idea is spectacular. – Arioch 'The Oct 19 '12 at 08:22
After a discussion with David Heffernan, I am starting to think that the Delphi compiler simply does not know what value it is assigning to the variable; a kind of "type erasure" takes place. It cannot tell a global constant from a local on-stack variable or a local string expression, and it cannot tell whether the source will still exist after the function exits. While we know that the source is a string literal, a global constant, or something else whose lifetime is independent of the function's execution, the compiler just loses that information. Instead it plays defensively and always clones the value, just in case the source ceases to exist. I am not sure, but that looks reasonable. Still, the consequence of this rough, indiscriminate codegen rule is one more gotcha in Delphi :-(

Marjan Venema
Arioch 'The
  • Any factual comment? Looks like I managed to acquire a personal hate-fan :-D – Arioch 'The Oct 12 '12 at 10:35
  • Sorry, I should have left a comment. I downvoted you (and am not a "personal hate-fan".) There were two reasons: (a): your answer is misleading - the compiler does know a lot of information about its variables, and the choice of which method is used internally is made based on the destination. (You're right it's not the source.) Arnaud's (upvoted) answer above goes into lots of detail about this and explains why, because there are some borderline cases. I don't think calling it a "gotcha" is correct either. – David Oct 17 '12 at 14:51
  • (b) The other answer is extremely high quality, delves into the source, lists all possible combinations and what is called when and why - i.e., is a perfect answer. Yours is "I am starting to think that..." - not a high quality answer. This second reason is why I gave the downvote. – David Oct 17 '12 at 14:52
  • Thanks for explaining. (a) Arnaud's post has a bunch of good ideas, but it is factually incorrect; at least it contradicts the behavior that I see in my XE2 and describe in my post. (b) The other answer seems to have been deleted, I don't know why. I don't recall its content now, but I remember it did not answer the question. My answer is certainly not a good one, but I waited for a week and no more answers were posted. – Arioch 'The Oct 18 '12 at 06:18
  • Arnaud's answer does not contradict facts. It is the correct answer. – David Heffernan Oct 18 '12 at 19:24
  • @David then can you instead answer my remarks in Arnaud's thread? His words about global vars are irrelevant to my examples. The source quote is of some archaeological value, having nothing in common with modern Delphi versions (and is again irrelevant because it again speaks of global vars). The bullet points were maybe true for D7 but have nothing in common with the modern Delphi behaviour demonstrated by my examples. – Arioch 'The Oct 18 '12 at 20:50
  • In this context, global means anything that isn't a local. The result var of a function could be a global, and so the compiler has to write code on that basis. Assign the return value of your function to a global variable and there's your relevance. – David Heffernan Oct 18 '12 at 20:52
  • Well, @David, discussing Arnaud's answer in my answer's sub-thread is simply off-topic. You're welcome to make specific claims in his sub-thread. Let's avoid polluting this one. – Arioch 'The Oct 19 '12 at 08:19
  • There's nothing to discuss. We've all explained what you've got wrong. It falls to you to climb down. – David Heffernan Oct 19 '12 at 08:21
  • The fact that, instead of answering specific comments in the proper thread, you keep posting vague off-topic remarks in a separate, irrelevant one is telling. – Arioch 'The Oct 19 '12 at 08:55