My EF models look like this:

public class ContentStatus
{
    public ContentStatus()
    {
        this.Contents = new List<Content>();
    }

    public int ContentStatusId { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Content> Contents { get; set; }
}

However, I have also seen implementations that look like this:

public class ContentStatus
{
    public ContentStatus()
    {
        this.Contents = new HashSet<Content>();
    }

    public int ContentStatusId { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Content> Contents { get; set; }
}

Here is the DDL for the corresponding table:

CREATE TABLE [dbo].[ContentStatus] (
    [ContentStatusId] INT           NOT NULL,
    [Name]            NVARCHAR (50) NOT NULL,
    CONSTRAINT [PK_ContentStatus] PRIMARY KEY CLUSTERED ([ContentStatusId] ASC)
);

Can anyone tell me which one I should use, or whether there is even a difference? If there is, when would I use the List and when the HashSet?

Thanks

Alan2

2 Answers

It depends on your use case, but in most cases an item can be added to the collection only once; here, for example, each status is applied to a given content only once. I doubt you can have one content appear twice in a status. HashSet is therefore the correct data structure, as it will prevent duplicates. If one item could legitimately appear more than once, List would be correct, but I have not encountered this in practice and do not even know how EF would handle it.
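To make the duplicate behaviour concrete, here is a minimal sketch outside of EF. The Content class and its ContentId property are assumptions for illustration, since the question does not show that entity. Note that HashSet<T> decides what counts as a duplicate through Equals and GetHashCode, so for a class that does not override them it only rejects the same instance, not a second instance with the same key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Content entity, which the question does not show.
public class Content
{
    public int ContentId { get; set; }
}

public static class DuplicateDemo
{
    public static void Main()
    {
        var item = new Content { ContentId = 1 };

        var list = new List<Content>();
        list.Add(item);
        list.Add(item);                 // a List accepts the same instance twice

        var set = new HashSet<Content>();
        bool first = set.Add(item);     // true: the item was added
        bool second = set.Add(item);    // false: already present, set unchanged

        // Content does not override Equals/GetHashCode, so the set compares
        // by reference and treats a new instance with the same key as distinct.
        bool sameKey = set.Add(new Content { ContentId = 1 });

        Console.WriteLine($"List count: {list.Count}");    // List count: 2
        Console.WriteLine($"Set count: {set.Count}");      // Set count: 2
        Console.WriteLine($"{first} {second} {sameKey}");  // True False True
    }
}
```

So the set protects you from adding the same object twice, which is the usual accident with a navigation collection, but it is not a substitute for a uniqueness constraint in the database.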

As a side note, I would advise against including a collection of items in your entities unless you need it. For example, if you are building a web app to list products, you probably have a view that displays a single product together with its tags, so Product should have a collection of Tags to make that case easy. However, you probably do not have a page that displays a Tag with its collection of products, so Tag should not have a Products property; it simply doesn't care about related products. It seems that this Status entity does not care about its collection of Contents either.
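As a rough sketch of that Product/Tag example, the entities could look like the following. The property names (ProductId, TagId, Name) are assumptions made up for illustration, and the EF mapping for a one-sided relationship is deliberately left out, since it depends on the EF version you use.

```csharp
using System.Collections.Generic;

public class Tag
{
    public int TagId { get; set; }      // hypothetical key name
    public string Name { get; set; }
    // No Products collection: nothing in this hypothetical app
    // needs to navigate from a tag to its products.
}

public class Product
{
    public Product()
    {
        this.Tags = new HashSet<Tag>();
    }

    public int ProductId { get; set; }  // hypothetical key name
    public string Name { get; set; }

    // Only this side of the relationship exposes a navigation property.
    public virtual ICollection<Tag> Tags { get; set; }
}
```

If a page that lists products by tag turns up later, the collection can be added to Tag at that point.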

Stilgar

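So a HashSet<T> is, by definition, a collection that contains no duplicate elements and whose elements are in no particular order.

A List<T> makes neither guarantee: it allows duplicates and keeps its elements in insertion order.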

So I guess it comes down to the characteristics you want and to performance (where the difference on small sets is negligible). Both can be enumerated.

Performance differences, although likely tiny, show up in two places: reads and writes. As Sean commented, writes will likely carry a penalty because of hash code calculation and uniqueness checks. Lookups, on the other hand, are extremely fast (O(1) on average).
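If you want to see the lookup difference for yourself, a rough sketch like the one below will do. It uses 2000 integers (the size mentioned in the comments) rather than EF entities, and a single Stopwatch reading is noisy and includes JIT warm-up, so treat the tick counts as an indication only, not a benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LookupDemo
{
    public static void Main()
    {
        const int count = 2000;   // illustrative size, taken from the comment thread

        List<int> list = Enumerable.Range(0, count).ToList();
        var set = new HashSet<int>(list);

        int target = count - 1;   // last element: worst case for the list scan

        var sw = Stopwatch.StartNew();
        bool inList = list.Contains(target);   // linear scan, O(n)
        long listTicks = sw.ElapsedTicks;

        sw.Restart();
        bool inSet = set.Contains(target);     // hash lookup, O(1) on average
        long setTicks = sw.ElapsedTicks;

        Console.WriteLine($"List:    found={inList}, ticks={listTicks}");
        Console.WriteLine($"HashSet: found={inSet}, ticks={setTicks}");
    }
}
```

Both calls return the same answer; only the cost of getting it differs, and at this size that cost is tiny either way.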

So really, it is all down to desired characteristics.

In my projects I would use List<T>, but that is just my convention. You can choose your own, as long as you stick to it.

Callum Linington
  • `HashSet` can be iterated over. Any class that implements `IEnumerable` can, and both `List` and `HashSet` implement it. – Xavier Poinas Nov 24 '15 at 10:11
  • I would assume that Hashset would be more costly for inserts, since it ensures uniqueness and hence must compare against other entries or at least their hashes, but less costly for removal (assuming a hash finds items faster than the enumeration List would do). – Sean B Mar 21 '17 at 16:27
  • @SeanB yes, but is all irrelevant on small sets of data. I was measuring perfs on arrays and hashsets the other night and for 2000 items I had to measure in ticks just to see some numbers (ms wasn't small enough) – Callum Linington Mar 21 '17 at 16:40
  • @CallumLinington, fair point, but then your answer's conclusion of performance being the deciding factor is moot. – Sean B Mar 21 '17 at 16:44
  • @SeanB yup, this answer seems more like general answer to the approach (i did it over a year ago). – Callum Linington Mar 21 '17 at 16:47