Cannot get correct rows from Entity Framework / SQL Server

Question

I have a table in SQL Server:

select * from TaskSave

TaskID  SaveTypeID  ResultsPath PluginName                      PluginConfiguration
---------------------------------------------------------------------
  92       1         NULL       NULL                            NULL
  92       7         NULL       RGP_MSWord.WordDocumentOutput   www|D:\Users\Peter\Documents\Temp
  92       7         NULL       RGP_WC.WCOutput                 wcwc|D:\Users\Peter\Documents\Temp|.docx|123|456|789

which I am trying to read with C# / Entity Framework:

public static List<TaskSave> GetSavesPerTask(Task task)
{
    using (Entities dbContext = new Entities())
    {
        var savesPerTask = from p in dbContext.TaskSaves.Include("SaveType").Where(q => q.TaskID == task.TaskID) select p;
        //return savesPerTask.ToList();

        // DEBUG
        var x = savesPerTask.ToList();

        foreach (var y in x)
        {
            Console.WriteLine("SaveTypeID {0}, Plugin Name {1}", y.SaveTypeID, y.PluginName);
        }

        return x;
    }
}

The business rule states that TaskID + SaveTypeID + PluginName are unique, i.e. a task can have more than one plug-in. If the save type is not ‘plugin’, then TaskID + SaveTypeID must be unique.

My problem is that the GetSavesPerTask method returns the wrong results. It retrieves three rows, but row 2 is duplicated – I get 2 rows with PluginName of RGP_MSWord.WordDocumentOutput and not the RGP_WC.WCOutput row. The debug print shows:

SaveTypeID 1, Plugin Name 
SaveTypeID 7, Plugin Name RGP_MSWord.WordDocumentOutput
SaveTypeID 7, Plugin Name RGP_MSWord.WordDocumentOutput

Both the debugger and the ultimate user of the data agree that the third row is absent.

I have tried removing the include clause, but that makes no difference to the result set. Here is the SQL (from the simpler case) as reported by the debugger:

savesPerTask    {SELECT 
    [Extent1].[TaskID] AS [TaskID], 
    [Extent1].[SaveTypeID] AS [SaveTypeID], 
    [Extent1].[ResultsPath] AS [ResultsPath], 
    [Extent1].[PluginName] AS [PluginName], 
    [Extent1].[PluginConfiguration] AS [PluginConfiguration]
    FROM (SELECT 
    [TaskSave].[TaskID] AS [TaskID], 
    [TaskSave].[SaveTypeID] AS [SaveTypeID], 
    [TaskSave].[ResultsPath] AS [ResultsPath], 
    [TaskSave].[PluginName] AS [PluginName], 
    [TaskSave].[PluginConfiguration] AS [PluginConfiguration]
    FROM [dbo].[TaskSave] AS [TaskSave]) AS [Extent1]
    WHERE [Extent1].[TaskID] = @p__linq__0}

I have copy and pasted the SQL from the debugger into SSMS and it gets the correct results. I have tried deleting and recreating those two table in the EF model. I have recompiled many times as I’ve tried different things (adding debug, refreshing the model, moving the return outside of the using block, etc.).

The TaskSave table does not have a primary key, so I get Error 6002 during compilation (which I ignore as it does not seem to affect anything else and I can’t find any work-around on the internet)

'Database.dbo.TaskSave' does not have a primary key defined. The key has been inferred and the definition was created as a read-only table/view.

How can I fix the GetSavesPerTask method to return the correct rows? What silly mistake have I made? I have many other tables and CRUD operations, which seem to work as expected.

Here are my versions

.NET 4.6.01055
C# 2015
SQL Server 2014
SQL Server data tools 14.0.50730.0

*"The TaskSave table does not have a primary key"* Here is the cause of your problem. — Ivan Stoev, Feb 13 '16 at 21:23
See [Entity Framework: table without primary key](http://stackoverflow.com/questions/3996782/entity-framework-table-without-primary-key) — Ivan Stoev, Feb 13 '16 at 21:58

score 2 · Accepted Answer · answered Feb 13 '16 at 22:14

Most likely, the trouble just simply is the lack of an explicit primary key. Every table in your database ought to have a primary key anyway...

What happens here is: since there's no primary key, EF will just use all non-nullable columns from the table (or the view) as a "replacement" PK. Not sure which those are in your case - possibly (TaskID, SaveTypeID) ??

And when EF reads the data, it will go:

read the row
check the primary key (or the "stand-in" PK in this case)
if it has already read a row with that PK - it just duplicates that row that it already has - it will disregard any non-PK columns from the table/view!

So in your case, once it's read a first row with the "stand-in" PK being TaskID = 92, SaveTypeID = 7, any further rows with those two values will just get the already read values - no matter what's stored in the database!

SOLUTION: as I said before: EVERY table OUGHT TO HAVE a proper primary key to uniquely identify each individual row!

In your case, I'd recommend to just add a surrogate key like this:

ALTER TABLE dbo.TaskSave
ADD TaskSaveID INT IDENTITY(1,1) NOT NULL

ALTER TABLE dbo.TaskSave
ADD CONSTRAINT PK_YourTableName
    PRIMARY KEY CLUSTERED(TaskSaveID)

Now, each row has its own, unique TaskSaveID, and EF can detect the proper PK, and all your troubles should be gone.

A surrogate identity is enough to solve the problem at hand and is a sufficiently correct answer. A more _recommended_ schema to me would be to set `TaskID` and `SaveTypeID` as primary key and create a separate `TaskSavePlugin` table with `TaskID`, `SaveTypeID` and `PluginName` as PK (`PluginName` is now not nullable). Then call `dbContext.TaskSaves.Include("SaveType").Include("SaveTypePlugins")` — Rhumborl, Feb 13 '16 at 22:23
@Rhumborl: I quite agree! This is a "quick fix" - to *at least* solve the missing PK design crime..... — marc_s, Feb 13 '16 at 22:24
Adding a primary key (identity column) fixed the problem and was easy to do. The `TaskSave` table started as a simple junction table but it grew until it broke. — Peter Bill, Feb 14 '16 at 19:02

Cannot get correct rows from Entity Framework / SQL Server

1 Answers1