7

After replacing hard type casts of AnsiString to TBytes (array of string) with a ToBytes method (see below), Delphi reported no memory leaks - Free Pascal 2.6.2 however shows a leak in case the TBytes value is passed to a method with a parameter of type Pointer.

The following code leaks memory:

program project1;

{$mode delphi}

uses
  SysUtils;

function ToBytes(const AValue: AnsiString): TBytes;
begin
  SetLength(Result, Length(AValue)); // <-- leak (ine 10)
  if Length(AValue) > 0 then
    Move(AValue[1], Result[0], Length(AValue));
end;

procedure Send(P: Pointer);
begin

end;

begin
  Send(ToBytes('test'));

  SetHeapTraceOutput('heaptrace.log');
end. 

Memory leak report:

Call trace for block $001C5CC0 size 12   $00401586  TOBYTES,  line 10
of project1.lpr   $00401622  main,  line 21 of project1.lpr

If I change the Send method to take an argument of type TBytes, the memory leak disappears.

mjn
  • 36,362
  • 28
  • 176
  • 378
  • Reported in Free Pascal/Lazarus Bug Tracker http://bugs.freepascal.org/view.php?id=25825 – mjn Mar 05 '14 at 08:21

3 Answers3

4

That's a compiler bug. The managed type TBytes has reference counted lifetime. The compiler should create an implicit local variable which is assigned the array returned by ToBytes. You'll need to work around that by storing to an explicit local:

var
  Tmp: TBytes;
....
Tmp := ToBytes(...);
Send(Tmp);
David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • There is something wrong with the compiler for sure. If I remove the TBytes declaration and use SysUtils (where its defined aswell) I get a SIGSEGV at the `end.` – Stefan Glienke Mar 04 '14 at 22:48
  • @StefanGlienke I tried moving the code out of the top level `begin end.` into a function, and got a SIGSEV on process startup. Weird. – David Heffernan Mar 04 '14 at 22:49
  • Seems to be somehow related to the heaptrc unit. `program project1; uses SysUtils, heaptrc; begin end.` => SIGSEGV – Stefan Glienke Mar 04 '14 at 22:53
  • Ok, my bad, heaptrc needs to be the first unit. Putting it there or using the -gh switch works. – Stefan Glienke Mar 04 '14 at 22:57
  • @StefanGlienke many thanks for your hint about SysUtils.TBytes, I have updated the code example – mjn Mar 05 '14 at 07:40
  • @mjn You can also remove the string parameter. Relevant for the problem is just passing the result of ToBytes to the Send routine. I would report that in their bug tracker (maybe its even fixed in other branches already) – Stefan Glienke Mar 05 '14 at 07:56
1

It is probably not a bug where you think it is. The heap tracing of FPC has known issues with tracking temps (and automated types in general) in main programs (the main .dpr begin..end).

Move the code to a procedure, and call that from the main begin..end. and you'll see the leak disappears.

This because the general structure of the main program is like

begin 
  initializeunits(); // procedure call inserted by the compiler
    <actual mainprogram statements>
  finalizeunits();   // procedure call inserted by the compiler
end.

with the releasing of mainprogram temps happening at "end." after finalizeunits that finalizes the heaptracking. (even if it is first unit, it is still only an unit). So heaptrc misses that.

Marco van de Voort
  • 25,628
  • 5
  • 56
  • 89
  • Moving the code to a procedure and calling it from the main begin ... end did not fix the leak (I have added a comment with the leak report to the bug tracker item). – mjn Mar 06 '14 at 12:31
  • Seems indeed to be a problem in fixes that is already solved in trunk. I recompiled with 2.6.x and got one memleak, and 0 with 2.7.1. If I add a for around it (with 2.6.x), I have 10 leaks. Pretty serious. – Marco van de Voort Mar 06 '14 at 19:20
0

Functions that return things that potentially comprise a lot of memory make me squeamish. This may be just a feeling or personal preference, but it does take away control over the allocation of the memory. There are similar things said about function that return (freshly constructed) instances of TStringList, there too it's advisable to pass a pointer to a TStrings object and let the caller keep control over the life-time of the object.

Here I would advise to use a var parameter for all operations on TBytes, and force the caller to provide an instance to work on. If you build an elaborate application and will have to search where you can improve performance, think about this again and see if you can recycle TBytes instances of similar content or length. An important feature of the Delphi string system is that is doing reference counting and copy-on-write for you behind the scenes, providing this performance gain when your application is handling a lot of (similar) strings.

Stijn Sanders
  • 35,982
  • 11
  • 45
  • 67
  • This is more of a comment than an answer to the question that was asked. You concentrate on one issue, namely the performance implications of less than optimal heap allocations. There's another issue and that is one of composability. A function that returns a value can be used as the argument to another function. Or the right hand side of an assignment statement. On the other hand, a procedure that returns through a `var` parameter necessitates multiple separate statements, and assignments to local variables that might not otherwise be needed. TBC – David Heffernan Mar 04 '14 at 21:24
  • So, if you are concerned about code readability and clarity, then return values are invariably to be preferred, where they are practical. As an example, I have a number of numeric types in my code base. One heavily used type is a vector with 3 real elements. This is declared as a record, many functions return it, and I make extensive use of operator overloading. You use it just as you would a built in type like `Integer`, `Double`, `string` etc. The code always feels right. TBC – David Heffernan Mar 04 '14 at 21:26
  • On the other hand, I have a vector type with dimension only known at runtime. This means that the elements need to live on the heap. And there can be a great many of them. So performance becomes a real issue. I really don't want to be performing any more heap allocations than needed. For this class the methods that operate on the type are all passed ready-made instances, constructed by the caller. This meets the performance needs, but the code is unreadable tripe! So, horses for courses. – David Heffernan Mar 04 '14 at 21:29
  • Yes, I first thought of posting this as a comment, but since it would also solve the memory leak, I posted it as an answer. I see you about the composability, but I raise with make a proper object then if you want more ease of use. (And 3 real's isn't that much memory, and not of variable length). About this class you mention, why aren't the methods operating on the object itself? Also I read a lot about functions that return Self so can do chaining, (which results in pretty readable tripe IMHO) and there are other methods to keep code concise and readable. – Stijn Sanders Mar 04 '14 at 21:36
  • Exactly. Three reals is not much. And they are values so function return value works well. Why aren't the methods operating on the object itself? They are. But that's what makes the code unreadable tripe. What I really want is to be able to use assignment, and operators. I could do that with a record holding a dyn array and then COW. But the perf stops me. This question (http://stackoverflow.com/questions/21753006/how-can-i-test-if-a-type-is-safe-to-copy-using-move) I can implement it, but the perf is no good. And yes, fluent interface is usually unreadable in my view. – David Heffernan Mar 04 '14 at 21:39
  • @DavidHeffernan I would tend to agree with this answer in most cases. I would rather sacrifice a bit of code readability than have a function return a newly created object. It's even worse when that function is used as an argument for a method. Who really owns the object. There is always a better way to handle those situations. If push comes to shove and it's really needed then I would end up creating an interface. – Graymatter Mar 04 '14 at 23:45
  • @Graymatter We aren't talking about objects though. The topic of the question is dynamic arrays. And why pick on them? Do you every return string from a function? Surely the argument in this answer would lead you to regard that as a bad thing. – David Heffernan Mar 04 '14 at 23:48
  • @DavidHeffernan Teach me to skim over things. I definitely don't have a problem returning strings or a dynamic array from a function (heap performance issues aside). – Graymatter Mar 05 '14 at 00:01