6

When I use TObjectDictionary, where TKey is object, my application work uncorrectly. I have two units, thats contain two classes. First unit:

unit RubTerm;

interface

type
  TRubTerm = Class(TObject)
  private
    FRubricName: String;
    FTermName: String;
  public
    property RubricName: String read FRubricName;
    property TermName: String read FTermName;
    constructor Create(ARubricName, ATermName: String);
  end;

implementation

constructor TRubTerm.Create(ARubricName, ATermName: String);
begin
  Self.FRubricName := ARubricName;
  Self.FTermName := ATermName;
end;

end;

And second unit:

unit ClassificationMatrix;

interface

uses
  System.Generics.Collections, System.Generics.Defaults, System.SysUtils, RubTerm;

type
TClassificationMatrix = class(TObject)
  private
    FTable: TObjectDictionary<TRubTerm, Integer>;
  public
    constructor Create;
    procedure TClassificationMatrix.AddCount(ADocsCount: Integer; ARubName, ATermName: String);
    function TClassificationMatrix.GetCount(ARubName, ATermName: String): Integer;
  end;

implementation

constructor TClassificationMatrix.Create;
begin
  FTable := TObjectDictionary<TRubTerm, Integer>.Create;
end;

procedure TClassificationMatrix.AddCount(ADocsCount: Integer; ARubName, ATermName: String);
var
  ARubTerm: TRubTerm;
begin
  ARubTerm := TRubTerm.Create(ARubName, ATermName);
  FTable.Add(ARubTerm, ADocsCount);
end;

function TClassificationMatrix.GetCount(ARubName, ATermName: String): Integer;
var
  ARubTerm: TRubTerm;
begin
  ARubTerm := TRubTerm.Create(ARubName, ATermName);
  FTable.TryGetValue(ARubTerm, Result);
end;

end;

But this fragment of code work unnormal:

procedure TestTClassificationMatrix.TestGetCount;
var
  DocsCountTest: Integer;
begin
  FClassificationMatrix.AddCount(10, 'R', 'T');
  DocsCountTest := FClassificationMatrix.GetCount('R', 'T');
end;
// DocsCountTest = 0! Why not 10? Where is problem?

Thanks!

Andrew
  • 157
  • 2
  • 7
  • 1
    You have to add an equality comparer to let the dictionary know, what do you mean by equal. Otherwise the key index is build on the instance reference – Sir Rufo Aug 05 '13 at 23:25

2 Answers2

8

The fundamental issue here is that the default equality comparer for your type does not behave the way you want it to. You want equality to mean value equality, but the default comparison gives reference equality.

The very fact that you are hoping for value equality is a strong indication that you should be using a value type rather than a reference type. And that's the first change that I would suggest.

type
  TRubTerm = record
    RubricName: string;
    TermName: string;
    class function New(const RubricName, TermName: string): TRubTerm; static;
    class operator Equal(const A, B: TRubTerm): Boolean;
    class operator NotEqual(const A, B: TRubTerm): Boolean;
  end;

class function TRubTerm.New(const RubricName, TermName: string): TRubTerm;
begin
  Result.RubricName := RubricName;
  Result.TermName := TermName;
end;

class operator TRubTerm.Equal(const A, B: TRubTerm): Boolean;
begin
  Result := (A.RubricName=B.RubricName) and (A.TermName=B.TermName);
end;

class operator TRubTerm.NotEqual(const A, B: TRubTerm): Boolean;
begin
  Result := not (A=B);
end;

I've added TRubTerm.New as a helper method to make it easy to initialize new instances of the record. And for convenience, you may also find it useful to overload the equality and inequality operators, as I have done above.

Once you switch to a value type, then you would also change the dictionary to match. Use TDictionary<TRubTerm, Integer> instead of TObjectDictionary<TRubTerm, Integer>. Switching to a value type will also have the benefit of fixing all the memory leaks in your existing code. Your existing code creates objects but never destroys them.

This gets you part way home, but you still need to define an equality comparer for your dictionary. The default comparer for a record will be based on reference equality since strings, despite behaving as value types, are stored as references.

To make a suitable equality comparer you need to implement the following comparison functions, where T is replaced by TRubTerm:

TEqualityComparison<T> = reference to function(const Left, Right: T): Boolean;
THasher<T> = reference to function(const Value: T): Integer;

I'd implement these as static class methods of the record.

type
  TRubTerm = record
    RubricName: string;
    TermName: string;
    class function New(const RubricName, TermName: string): TRubTerm; static;
    class function EqualityComparison(const Left, 
      Right: TRubTerm): Boolean; static;
    class function Hasher(const Value: TRubTerm): Integer; static;
    class operator Equal(const A, B: TRubTerm): Boolean;
    class operator NotEqual(const A, B: TRubTerm): Boolean;
  end;

Implementing EqualityComparison is easy enough:

class function TRubTerm.EqualityComparison(const Left, Right: TRubTerm): Boolean;
begin
  Result := Left=Right;
end;

But the hasher requires a little more thought. You need to hash each field individually and then combine the hashes. For reference:

The code looks like this:

{$IFOPT Q+}
  {$DEFINE OverflowChecksEnabled}
  {$Q-}
{$ENDIF}
function CombinedHash(const Values: array of Integer): Integer;
var
  Value: Integer;
begin
  Result := 17;
  for Value in Values do begin
    Result := Result*37 + Value;
  end;
end;
{$IFDEF OverflowChecksEnabled}
  {$Q+}
{$ENDIF}

function GetHashCodeString(const Value: string): Integer;
begin
  Result := BobJenkinsHash(PChar(Value)^, SizeOf(Char) * Length(Value), 0);
end;

class function TRubTerm.Hasher(const Value: TRubTerm): Integer;
begin
  Result := CombinedHash([GetHashCodeString(Value.RubricName), 
    GetHashCodeString(Value.TermName)]);
end;

Finally, when you instantiate your dictionary, you need to provide an IEqualityComparison<TRubTerm>. Instantiate your dictionary like this:

Dict := TDictionary<TRubTerm,Integer>.Create(
  TEqualityComparer<TRubTerm>.Construct(
    TRubTerm.EqualityComparison,
    TRubTerm.Hasher
  )
);
Community
  • 1
  • 1
David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • 1
    Great work as always. But as an aside, taking a step back for a second, doesn't this seem like a lot of work compared to some other languages? Do you imagine the groundwork to simplify this some will come with XE5 and the nextgen compiler? Value types vs. reference types, objects vs. Delphi's just-because-we-don't-have-memory-management records with methods, TDictionary vs. TObjectDictionary, 11 different comparer classes... it just seems like the language is ballooning out of control to get around a lot of its defects that have never been addressed. And "BobJenkinsHash" instead of "hash"? :-( – alcalde Aug 06 '13 at 18:15
  • 1
    @alcade I agree that the syntax is too clumsy and verbose. – David Heffernan Aug 06 '13 at 18:23
3

A Dictionary depends on a key value. You are storing a reference to an object in the key. If you create two objects that are setup identically the have different values and hence different keys.

var
  ARubTerm1: TRubTerm;
  ARubTerm2: TRubTerm;
begin
  ARubTerm1 := TRubTerm.Create('1', '1');
  ARubTerm2 := TRubTerm.Create('1', '1');
 //  ARubTerm1 = ARubTerm2 is not possible here as ARubTerm1 points to a different address than ARubTerm2
end;

Instead you could uses a String as the First Type Parameter in the TObjectDictonary that is based on RubricName and TermName. With this you would then get back the same value.

It should also be noted, that above code in XE2 creates two memory leaks. Every object created must be freed. Hence this section of code also is leaking memory

function TClassificationMatrix.GetCount(ARubName, ATermName: String): Integer;
var
  ARubTerm: TRubTerm;
begin
  ARubTerm := TRubTerm.Create(ARubName, ATermName);
  FTable.TryGetValue(ARubTerm, Result);
end;

Given all of that. If you want to use an Object as a Key you can do it with a Custom Equality Comparer. Here is your example changed to implement IEqualityComparer<T>, and fix a few memory leaks.

unit ClassificationMatrix;

interface

uses
  Generics.Collections, Generics.Defaults, SysUtils, RubTerm;

type
TClassificationMatrix = class(TObject)
  private
    FTable: TObjectDictionary<TRubTerm, Integer>;
  public
    constructor Create;
    procedure AddCount(ADocsCount: Integer; ARubName, ATermName: String);
    function GetCount(ARubName, ATermName: String): Integer;
  end;

implementation

constructor TClassificationMatrix.Create;
var
 Comparer : IEqualityComparer<RubTerm.TRubTerm>;
begin
  Comparer := TRubTermComparer.Create;
  FTable := TObjectDictionary<TRubTerm, Integer>.Create([doOwnsKeys],TRubTermComparer.Create);
end;

procedure TClassificationMatrix.AddCount(ADocsCount: Integer; ARubName, ATermName: String);
var
  ARubTerm: TRubTerm;
begin
  ARubTerm := TRubTerm.Create(ARubName, ATermName);
  FTable.Add(ARubTerm, ADocsCount);
end;

function TClassificationMatrix.GetCount(ARubName, ATermName: String): Integer;
var
  ARubTerm: TRubTerm;
begin
  ARubTerm := TRubTerm.Create(ARubName, ATermName);
  try
   if Not FTable.TryGetValue(ARubTerm, Result) then
      result := 0;
  finally
    ARubTerm.Free;
  end;
end;

end.

And the RubTerm.pas unit

unit RubTerm;

interface
uses Generics.Defaults;

type
  TRubTerm = Class(TObject)
  private
    FRubricName: String;
    FTermName: String;
  public
    property RubricName: String read FRubricName;
    property TermName: String read FTermName;
    constructor Create(ARubricName, ATermName: String);
    function GetHashCode: Integer; override;
  end;

  TRubTermComparer = class(TInterfacedObject, IEqualityComparer<TRubTerm>)
  public
    function Equals(const Left, Right: TRubTerm): Boolean;
    function GetHashCode(const Value: TRubTerm): Integer;
  end;


implementation

constructor TRubTerm.Create(ARubricName, ATermName: String);
begin
  Self.FRubricName := ARubricName;
  Self.FTermName := ATermName;
end;


{ TRubTermComparer }

function TRubTermComparer.Equals(const Left, Right: TRubTerm): Boolean;
begin
  result := (Left.RubricName = Right.RubricName) and (Left.TermName = Right.TermName);
end;

function TRubTermComparer.GetHashCode(const Value: TRubTerm): Integer;
begin
  result := Value.GetHashCode;
end;

//The Hashing code was taken from David's Answer to make this a complete answer.    
{$IFOPT Q+}
  {$DEFINE OverflowChecksEnabled}
  {$Q-}
{$ENDIF}
function CombinedHash(const Values: array of Integer): Integer;
var
  Value: Integer;
begin
  Result := 17;
  for Value in Values do begin
    Result := Result*37 + Value;
  end;
end;
{$IFDEF OverflowChecksEnabled}
  {$Q+}
{$ENDIF}

function GetHashCodeString(const Value: string): Integer;
begin
  Result := BobJenkinsHash(PChar(Value)^, SizeOf(Char) * Length(Value), 0);
end;

function TRubTerm.GetHashCode: Integer;

begin
  Result := CombinedHash([GetHashCodeString(Value.RubricName), 
    GetHashCodeString(Value.TermName)]);    
end;

end.
Robert Love
  • 12,447
  • 2
  • 48
  • 80
  • 2
    This is broadly sound in the general, but badly wrong in the specifics. You cannot concatenate two strings and compare the result. If you do then you have 'a', '' = '', 'a' for example. You need to compare both fields. Same approach for the hash. And you cannot implement equals by using the hash. Different hash implies different values. But equal hash does not imply equal values. There are more values that there are hashes so that is obviously a false assumption. – David Heffernan Aug 06 '13 at 07:12
  • Updated my Equals implementation, but left Hashing alone as it covered better in your answer. – Robert Love Aug 06 '13 at 14:51
  • If you are not going to fix the hash code (and I see no reason why you should not fix it, and feel free to copy the code from my answer if you wish), you should at least make it clear in an edit that it is broken. – David Heffernan Aug 06 '13 at 14:55
  • Ok, I took the code from your answer on hashing. I never feel good at doing that without permission. – Robert Love Aug 06 '13 at 15:01
  • I think even that is better than leaving such an error in an accepted answer. Always best for the accepted answer to be accurate, for the benefit of future visitors. So, belatedly, +1! – David Heffernan Aug 06 '13 at 15:04
  • 1
    @RobertLove That was a beautiful, clear written explanation in English of what was going on and how to solve it. Seriously - textbook quality (learning a new language right now so I'm very familiar with how books explain this stuff). – alcalde Aug 06 '13 at 18:04