This Embarcadero article discussing memory issues for the XE7 IDE contains the following:

Be aware of the “Growth by Generics”

Another scenario that might depend on your application code and cause an increase of the memory used by the compiler and the debugger relates with the way generic data types are used. The way the Object Pascal compiler works can cause the generation of many different types based on the same generic definition, at times even totally identical types that are compiled in different modules. While we won’t certainly suggest removing generics, quite the contrary, there are a few options to consider:

  • Try to avoid circular unit references for units defining core generic types
  • Define and use the same concrete type definitions when possible
  • If possible, refactor generics to share code in base classes, from which a generic class inherits

The last item I understand. The first two I am less clear on.

Do these issues affect only IDE performance, or is there an impact on the size of the compiled code?

For instance, considering the second item, if I declare TList<Integer> in two separate units, will I get two separate chunks of code in each of those units in my executable? I certainly hope not!

David Heffernan
  • How would you link a DCU that didn't have code for the generic types it used? The compiler can't know that the unit will always be compiled together with another unit that shares a generic type. It would need to be the linker that could retroactively remove duplicate generic types from the DCUs. That seems challenging... – J... Jul 28 '15 at 19:20
  • @J... That's Embarcadero's problem. Do C++ and C# tool chains have any problems doing this? – David Heffernan Jul 28 '15 at 19:26
  • I don't disagree, I'm just observing a possible reason why they've let it slide. – J... Jul 28 '15 at 19:35
  • Have there, to your (or anyone's) knowledge, been any changes on this topic in newer versions of Delphi (Seattle/Berlin)? – Tupel Apr 26 '16 at 08:15
  • FWIW, C++ has this issue as well (google for template code bloat), and the general advice there is likewise to refactor out all code that does not depend on the type parameter. However, there is a nice feature to prevent code bloat: see https://stackoverflow.com/questions/8130602/using-extern-template-c11 – Stefan Glienke Feb 28 '19 at 13:56

2 Answers

Point 2 refers to instantiating the same generic type wherever possible: for instance, using TList<Integer> everywhere instead of having two generic types, TList<Integer> and TList<SmallInt>.

Declaring and using TList<Integer> in several units includes only a single copy of TList<Integer> in the exe file. Declaring TIntegerList = TList<Integer> gives the same result.

The generic bloat people refer to comes from having a complete copy of TList<T> for each specific type you use, even when the underlying generated code is the same.

For instance, TList<TObject> and TList<TPersistent> will include two separate copies of TList<T>, even though the generated code could be folded into a single one.

That brings us to Point 3: putting the common code in a base class and layering generic classes on top of it for type safety can save memory both during compilation and in the exe file.

For example, building a generic class on top of the non-generic TObjectList includes only a thin generic layer for each specific type instead of the complete TObjectList functionality. This was reported as QC 108966.

type
  // TObjectList here is the non-generic class (System.Contnrs), not TObjectList<T>.
  TXObjectList<T: class, constructor> = class(TObjectList)
  protected
    function GetItem(index: Integer): T;
    procedure SetItem(index: Integer; const Value: T);
  public
    function Add: T;
    property Items[index: Integer]: T read GetItem write SetItem; default;
  end;

function TXObjectList<T>.GetItem(index: Integer): T;
begin
  Result := T(inherited GetItem(index));
end;

procedure TXObjectList<T>.SetItem(index: Integer; const Value: T);
begin
  inherited SetItem(index, Value);
end;

function TXObjectList<T>.Add: T;
begin
  Result := T.Create;
  inherited Add(Result);
end;
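
A minimal usage sketch of the class above, repeated here so the program is self-contained. It assumes a desktop compiler where the non-generic System.Contnrs.TObjectList is available; TCustomer and the program name are hypothetical and only there for illustration.

```pascal
program XObjectListDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Contnrs;

type
  // Thin generic layer over the non-generic TObjectList (same code as above).
  TXObjectList<T: class, constructor> = class(TObjectList)
  protected
    function GetItem(index: Integer): T;
    procedure SetItem(index: Integer; const Value: T);
  public
    function Add: T;
    property Items[index: Integer]: T read GetItem write SetItem; default;
  end;

  // Hypothetical item class, used only for this demo.
  TCustomer = class
  public
    Name: string;
  end;

function TXObjectList<T>.GetItem(index: Integer): T;
begin
  Result := T(inherited GetItem(index));
end;

procedure TXObjectList<T>.SetItem(index: Integer; const Value: T);
begin
  inherited SetItem(index, Value);
end;

function TXObjectList<T>.Add: T;
begin
  Result := T.Create;
  inherited Add(Result);
end;

var
  Customers: TXObjectList<TCustomer>;
  Customer: TCustomer;
begin
  // The parameterless TObjectList constructor creates a list that owns its items.
  Customers := TXObjectList<TCustomer>.Create;
  try
    Customer := Customers.Add;        // creates a TCustomer and stores it
    Customer.Name := 'Alice';
    Writeln(Customers[0].Name);       // typed access, no cast at the call site
  finally
    Customers.Free;                   // frees the owned TCustomer as well
  end;
end.
```

Only GetItem, SetItem and Add are instantiated per type; the storage and list management stay in the single non-generic TObjectList implementation.
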
Dalija Prasnikar
  • Deriving from TObjectList would stop you writing code that accepted `TList` and supplying `TObjectList`. Emba's XE8 changes to System.Generics attacked this a different way but they screwed it up because they don't have decent unit tests. You know this I am sure. The net result is dire RTL code that is unreadable. They should bite the bullet and do the job properly. A half assed solution is worse than before in my view. Thank you for your commentary on Point 2. – David Heffernan Jul 28 '15 at 19:59
  • Dalija, this is not true. I just compiled a simple console app with two units, Unit4 and Unit5. Unit4 had a single procedure Test4, and Unit5 a single procedure Test5. If I made Test5 empty, but Test4 used a TList, the .exe size was 2048k. If I gave Test5 exactly the same code, the .exe was 2067k. So this added 19k to the executable. – Rudy Velthuis Jul 28 '15 at 20:18
  • @RudyVelthuis I will have to recheck with a console app. With a regular app my points stand. I have been fighting generics bloat since Delphi XE, so I have a pretty good idea what I am talking about. – Dalija Prasnikar Jul 28 '15 at 20:25
  • "Regular" apps already use a lot of generics below the surface, so a few more instantiations won't make a big difference. But there is still a difference in size and each TList is instantiated anew. – Rudy Velthuis Jul 28 '15 at 20:27
  • @RudyVelthuis I had zero size increase with using TList in different units, even when they are declared in different packages. It would be impossible to miss 19K size increase. – Dalija Prasnikar Jul 28 '15 at 20:35
  • @DavidHeffernan Yep, if you use `TObjectList` you cannot pass it as `TList` (not to mention mobile compilers), but that is an example of what you can do if you want or need to fight generics bloat. The only proper solution is creating a better compiler, and that is the one thing we cannot do. – Dalija Prasnikar Jul 28 '15 at 20:38
  • @RudyVelthuis can you post your console app project somewhere, I cannot reproduce what you are claiming with console app either (using XE4) – Dalija Prasnikar Jul 28 '15 at 20:44
  • I don't have a website at the moment. Usually I would post there. Is there a free site to post code like this? – Rudy Velthuis Jul 28 '15 at 21:16
  • @RRUZ: thanks, that is what I meant. But I deleted my answer. I do see different sizes, but the same addresses, so I am not sure what causes the different size. – Rudy Velthuis Jul 28 '15 at 21:24
  • @Dalija I expect to accept this in due course. I want to do my own experiments once I can reach a compiler. I'll likely write them up in an answer, and then accept yours. – David Heffernan Jul 29 '15 at 18:16
  • Take your time. I did some additional testing, but the point remains: a single copy is included in the exe. My original tests (both debug and release mode) were done with `Linking - Debug information` disabled. When it is enabled there is some difference in exe size between various scenarios, but that most likely comes from additional debug information. Like @Rudy said, the address is the same, and the map file also shows only a single copy of `TList` included. Hope that helps. – Dalija Prasnikar Jul 29 '15 at 18:29
  • This seems to be a neat solution! Anybody know how to view the mentioned QC 108966 since the old QC system has been abandoned? I did a search of '108966' in the new quality.embarcadero.com system without result. – Edwin Yip Oct 14 '18 at 15:19
  • @EdwinYip If you don't have the QC reports archived, then you cannot access it. But it is a really old report. It contains the code included in this answer as one possible way to quickly fix generic bloat for objects without involving the compiler. In newer Delphi versions, code bloat in the provided collections has been solved in another way, with the [TListHelper](http://docwiki.embarcadero.com/Libraries/Tokyo/en/System.Generics.Collections.TListHelper) class, which covers all possible type sizes, not just objects. – Dalija Prasnikar Oct 14 '18 at 22:42

The code bloat the article talks about (it is about out-of-memory issues in the IDE) relates to the generated DCUs and all the meta information held in the IDE. Every DCU contains all the generics it uses. Only when the binary is compiled does the linker remove duplicates.

That means that if Unit1.pas and Unit2.pas both use TList<Integer>, then both Unit1.dcu and Unit2.dcu have the binary code for TList<Integer> compiled in.

If you declare TIntegerList = TList<Integer> in Unit3 and use that in Unit1 and Unit2, you might think that only Unit3.dcu would include the compiled TList<Integer>, and not the other two. Unfortunately, that is not the case.
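
To make the Unit3 scenario concrete, here is a minimal sketch of such a shared unit. The names TIntegerListAlias and TIntegerListClass are illustrative and do not come from the discussion. The alias is only another name for TList<Integer>, which is why it is still compiled into every unit that uses it. According to a comment on this answer, only a descendant class changes that, and the commenter does not consider it a real solution.

```pascal
unit Unit3;

interface

uses
  System.Generics.Collections;

type
  // Type alias: the same type as TList<Integer> under another name.
  // Units that use it still instantiate TList<Integer> themselves.
  TIntegerListAlias = TList<Integer>;

  // Descendant class: a new, distinct type declared in this unit.
  // Reported in the comments as the only variant that avoids the
  // per-unit copies (an assumption here, not verified).
  TIntegerListClass = class(TList<Integer>)
  end;

implementation

end.
```

Note the trade-off of the descendant: a TIntegerListClass can be passed where a TList<Integer> is expected, but a plain TList<Integer> cannot be passed where TIntegerListClass is expected.
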

Stefan Glienke
  • So in this case, having `TIntegerList = TList<Integer>` declared in some common unit can help preserve memory in the IDE, but will not have any effect on the binary file. – Dalija Prasnikar Jul 30 '15 at 09:12
  • @DalijaPrasnikar Unfortunately not, because that is just an alias. It will still compile the type into each unit (just tested). Only writing `TIntegerList = class(TList<Integer>)` would solve that, and imo that's not really a solution. – Stefan Glienke Jul 30 '15 at 09:46
  • Thanks.... Nothing we do on our side is really the solution, but knowing workarounds can help in certain situations. Can you please add your explanation to your answer? – Dalija Prasnikar Jul 30 '15 at 09:48
  • It strikes me that it is a terrible shame that they got the RTL people to destroy the classes in System.Generics.Collections in XE8 rather than getting the compiler people to fix the real problem. Likewise the Spring collections have been redesigned in a similar manner. All it takes is for the linker to examine each instantiated method looking for duplicates to be merged. – David Heffernan Jul 31 '15 at 06:04