
I want to know the difference between creating classes with and without a HashSet in the constructor.

Using the Code First approach (EF 4.3), one can create models like this:

public class Blog
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string BloggerName { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public ICollection<Comment> Comments { get; set; }
}

Or one can create models like this:

public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }
    public int Id { get; set; }
    public string FirstName { get; set; }
    public ICollection<BrokerageAccount> BrokerageAccounts { get; set; }
}

public class BrokerageAccount
{
    public int Id { get; set; }
    public string AccountNumber { get; set; }
    public int CustomerId { get; set; }
}

What is the HashSet doing here?

Should I use a HashSet in the first two models as well?

Is there an article that shows how HashSet is applied?

Darin Dimitrov
Amir Jalali

3 Answers


Generally speaking, it is best to use the collection that best expresses your intentions. If you do not specifically intend to use the HashSet's unique characteristics, I would not use it.

It is unordered and does not support lookups by index. Furthermore, it is not as well suited for sequential reads as other collections, and the fact that it allows you to add the same item multiple times without creating duplicates is only useful if you have a reason to use it for that. If that is not your intention, it can hide misbehaving code and make problems difficult to isolate.

The HashSet is mostly useful in situations where insertion and removal times are very important, such as when processing data. It is also extremely useful for comparing sets of data (again when processing) using operations like intersect, except, and union. In any other situation, the cons generally outweigh the pros.
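
As a rough illustration of the duplicate handling and set operations mentioned above, here is a minimal sketch using only the standard `HashSet<T>` API. The tag values and the `HashSetDemo` class are made up for the example and have nothing to do with the models in the question.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    public static void Main()
    {
        var tags = new HashSet<string> { "ef", "csharp" };

        // Add returns false and leaves the set unchanged if the item is already present.
        bool added = tags.Add("ef");
        Console.WriteLine(added);       // False
        Console.WriteLine(tags.Count);  // 2

        var other = new HashSet<string> { "csharp", "linq" };

        // The set operations modify the set they are called on, so work on copies.
        var common = new HashSet<string>(tags);
        common.IntersectWith(other);    // leaves only "csharp"

        var onlyInTags = new HashSet<string>(tags);
        onlyInTags.ExceptWith(other);   // leaves only "ef"

        var all = new HashSet<string>(tags);
        all.UnionWith(other);           // "ef", "csharp", "linq" in no guaranteed order
        Console.WriteLine(all.Count);   // 3
    }
}
```

Note that the silent `false` from `Add` is exactly what can hide a bug if you did not expect the item to be there already.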

Consider that when working with blog posts, inserts and removes are quite rare compared to reads, and you generally want to read the data in a specific order, anyway. That is more or less the exact opposite of what the HashSet is good at. It is highly doubtful that you would ever intend to add the same post twice, for any reason, and I see no reason why you would use set-based operations on posts in a class like that.

Daniel

The HashSet does not define the type of collection that will be generated when you actually fetch data. This will always be of type ICollection as declared.

The HashSet created in the constructor is to help you avoid NullReferenceExceptions when no records are fetched or exist in the many side of the relationship. It is in no way required.

For example, based on your question, when you try to use a relationship like...

var myCollection = Blog.Posts();

If no Posts exist then myCollection will be null. That is fine until you chain a call onto it, for example:

var myCollectionCount = Blog.Posts.Count();

which will error with a NullReferenceException.

Whereas

var myCollection = Customer.BrokerageAccounts();
var myCollectionCount = Customer.BrokerageAccounts.Count();

will result in an empty ICollection and a zero count. No exceptions :-)
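
To make the difference concrete, here is a small sketch that uses plain objects created with `new`, without any Entity Framework context involved. The class names `BlogWithoutInit` and `BlogWithInit` are hypothetical and exist only for this comparison. How EF itself populates the property when it loads data is a separate matter that this sketch does not cover.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
}

// Hypothetical model: the collection is never initialized.
public class BlogWithoutInit
{
    public ICollection<Post> Posts { get; set; }
}

// Hypothetical model: the constructor assigns an empty HashSet.
public class BlogWithInit
{
    public BlogWithInit()
    {
        Posts = new HashSet<Post>();
    }

    public ICollection<Post> Posts { get; set; }
}

public static class NullCollectionDemo
{
    public static void Main()
    {
        var a = new BlogWithoutInit();
        Console.WriteLine(a.Posts == null);   // True
        // Reading a.Posts.Count here would throw a NullReferenceException.

        var b = new BlogWithInit();
        Console.WriteLine(b.Posts.Count);     // 0
        b.Posts.Add(new Post { Id = 1 });     // safe, the collection already exists
        Console.WriteLine(b.Posts.Count);     // 1
    }
}
```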

NER1808
  • Is the `()` on properties valid (`Blog.Posts()`)? Shouldn't it just be `Blog.Posts` to access the field? – bradlis7 Jan 30 '15 at 19:48
  • This seems to be wrong. The debugger shows me exactly the type I use in my constructor, even for data fetched from the database. This is also reflected in different behaviors when accessing the collection (e.g. through DataBinding on those collections). – linac Mar 18 '15 at 13:44
  • @linac It's not the HashSet that defines the return type, but the definition of the ICollection property. The HashSet is just used to initialize the ICollection property. If you don't initialize the property in the constructor, the debugger will still show the ICollection type as defined. Nothing to do with the HashSet!! – NER1808 Apr 07 '15 at 10:46
  • You'll have to mark your property as virtual for EF to override the collection type. Otherwise it has no other option than to keep the available list. – Joep Beusenberg Aug 12 '16 at 08:58

I'm fairly new to Entity Framework but this is my understanding. The collection types can be any type that implements ICollection<T>. In my opinion a HashSet is usually the semantically correct collection type. Most collections should only have one instance of a member (no duplicates) and HashSet best expresses this. I have been writing my classes as shown below and this has worked well so far. Note that the collection is typed as ISet<T> and the setter is private.

public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }
    public int Id { get; set; }
    public string FirstName { get; set; }
    public ISet<BrokerageAccount> BrokerageAccounts { get; private set; }
}
Jamie Ide