0

I'm using EF 6 Code First and trying to figure out the best way to configure my relationships using data annotations, but for some reason EF is adding extra columns to the schema that I don't want.

I have two entities: Ship and Voyage. A ship can have many voyages; a voyage belongs to one and only one ship. So I started out with this (simplified for SO purposes):

public class Ship
{
    public int Id { get; set; }
    public virtual ICollection<Voyage> Voyages { get; set; }
}

public class Voyage
{
    public int Id { get; set; }
    public int ShipId { get; set; }
    public virtual Ship Ship { get; set; }
    public DateTimeOffset Started { get; set; }
}

This creates two tables in the database, as follows:

CREATE TABLE [dbo].[Ship] (
    [Id]                INT                IDENTITY (1, 1) NOT NULL,
    CONSTRAINT [PK_dbo.Ship] PRIMARY KEY CLUSTERED ([Id] ASC)
);

CREATE TABLE [dbo].[Voyage] (
    [Id]               INT                IDENTITY (1, 1) NOT NULL,
    [ShipId]           INT                NOT NULL,
    [Started]          DATETIMEOFFSET (7) NOT NULL,
    CONSTRAINT [PK_dbo.Voyage] PRIMARY KEY CLUSTERED ([Id] ASC),
    CONSTRAINT [FK_dbo.Voyage_dbo.Ship_ShipId] FOREIGN KEY ([ShipId]) REFERENCES [dbo].[Ship] ([Id]) ON DELETE CASCADE
);

So far, so good. The problem comes when I want to query this data. I need to obtain a list of ships, but for each ship I want the most recent voyage as well. I couldn't see a way of writing a LINQ query that would do this in one hit, even though I would be able to write such a query using SQL.

My options at this point seemed to be:

  • Load all ships, and for each ship eagerly load all of its voyages. I didn't want to do this because of the performance implications (each ship may ultimately have lots of voyages)
  • Load all ships first, then foreach over the resulting list and perform a separate query for each ship to load its "most recent" voyage. Although this works it seems kind of inefficient compared to a single query.

I then thought I could add a navigation property to the Ship entity that would be used to reference directly the "most recent" voyage. So I tried this:

public class Ship
{
    public int Id { get; set; }
    public virtual ICollection<Voyage> Voyages { get; set; }
    public int? MostRecentVoyageId { get; set; }    <-- new property added
    public virtual Voyage MostRecentVoyage { get; set; }    <-- new property added
}

I made MostRecentVoyageId nullable because a ship will not have any voyages at all when it is first created.

This gives me the following in the database:

CREATE TABLE [dbo].[Ship] (
    [Id]                  INT                IDENTITY (1, 1) NOT NULL,
    [MostRecentVoyageId]  INT                NULL,
    CONSTRAINT [PK_dbo.Ship] PRIMARY KEY CLUSTERED ([Id] ASC),
    CONSTRAINT [FK_dbo.Ship_dbo.Voyage_MostRecentVoyageId] FOREIGN KEY ([MostRecentVoyageId]) REFERENCES [dbo].[Voyage] ([Id])
);

which is fine for the Ship table, but I get this for the Voyage table:

CREATE TABLE [dbo].[Voyage] (
    [Id]               INT                IDENTITY (1, 1) NOT NULL,
    [ShipId]           INT                NOT NULL,
    [Started]          DATETIMEOFFSET (7) NOT NULL,
    [Ship_Id]          INT                NULL,
    CONSTRAINT [PK_dbo.Voyage] PRIMARY KEY CLUSTERED ([Id] ASC),
    CONSTRAINT [FK_dbo.Voyage_dbo.Ship_ShipId] FOREIGN KEY ([ShipId]) REFERENCES [dbo].[Ship] ([Id]) ON DELETE CASCADE,
    CONSTRAINT [FK_dbo.Voyage_dbo.Ship_Ship_Id] FOREIGN KEY ([Ship_Id]) REFERENCES  [dbo].[Ship] ([Id])
);

Note the extra Ship_Id column and the additional foreign key relationship. I'm pretty certain that this column isn't needed, but I can't find a way of getting rid of it. I've tried using the [ForeignKey] and [InverseProperty] attributes but these just give me an exception.

Is this the kind of relationship I can't configure using EF data annotations? Do I have to use the fluent syntax for this? Am I doing it all wrong anyway: is there a better way to do the LINQ queries using my original entity types?

I have tried looking on SO for people having similar problems and I found this, this and this but none of those seemed to help.

Community
  • 1
  • 1
Steven Rands
  • 5,160
  • 3
  • 27
  • 56
  • I had almost the exact same model but in a different context and I got rid of the unwanted column in the database by doing the equivelent of adding `[ForeignKey("MostRecentVoyage")]` to MostRecentVoyageId and adding `[InverseProperty("Ship")]` to voyages. Does this throw the exceptions you mention and if so what are the exceptions. – Ben Robinson Nov 18 '14 at 11:35
  • @BenRobinson I've just tried that, it doesn't throw an exception, but I still end up with the extra `Ship_Id` column on the `Voyage` table. I assume that the `[InverseProperty]` attribute should go on the voyage's `ShipId` property? – Steven Rands Nov 18 '14 at 12:17
  • No the `[InverseProperty("Ship")]` should go on the `Voyages` property on the `Ship` entity. – Ben Robinson Nov 18 '14 at 12:24
  • @BenRobinson Awesome! That works like a charm. Many thanks. Would you be able to put your comment into an answer so that I could accept it? – Steven Rands Nov 18 '14 at 13:26
  • Added it as an answer as requested. – Ben Robinson Nov 18 '14 at 13:30

1 Answers1

1

I had almost the exact same model but in a different context and I got rid of the unwanted column in the database by doing the equivalent of adding [ForeignKey("MostRecentVoyage")] attribute to MostRecentVoyageId and adding [InverseProperty("Ship")] attribute to the Voyages property both on the Ship entity/class.

Ben Robinson
  • 21,601
  • 5
  • 62
  • 79
  • I've since found a way of doing a single LINQ query to get the data back in one "hit" but the generated SQL is more complex than when using the `MostRecentVoyage` foreign key, ie. `ship.MostRecentVoyage` vs `ship.Voyages.OrderByDescending(x=>x.Started).FirstOrDefault()`. Do you think the explicit foreign key is a bad architecture decision or is it worth the added maintenance cost in order to get a more performant query? (Assuming it is more performant, I haven't benchmarked either yet) – Steven Rands Nov 18 '14 at 13:48
  • Becnhmark and if calculating the value dynamically is fast enough for your purposes then go with that. Then you don't have to worry about making sure the foreign key is always correct which can be a maintenance pain as you say. – Ben Robinson Nov 18 '14 at 13:56