6

I am using Entity Framework to map two tables together using Entity Splitting as outlined here and here.

I have found that if I execute a .ToList() on an IQueryable<SplitEntity> then the results are from an Inner Join. However, If I take that same IQueryable and execute a .Count() it will return the number of records returned by a Full Join.

Here is a unit test that fails:

    [TestMethod]
    public void GetCustomerListTest()
    {
        // arrange
        List<Customer> results;
        int count;

        // act
        using (var context = new DataContext())
        {
            results = context.Customers.ToList();
            count = context.Customers.Count();
        }

        // assert
        Assert.IsNotNull(results); // succeeds
        Assert.IsTrue(results.Count > 0); // succeeds. Has correct records from inner join
        Assert.AreEqual(count, results.Count); // This line fails. Has incorrect count from full join.
    }

This strikes me as very bad. How can I get the .Count() method to return the results from an Inner Join like the .ToList()?

Update - SQL

I was wrong about the full vs inner joins.

The .ToList() results in:

    SELECT 
    [Extent1].[CustomerNumber] AS [CustomerNumber], 
    -- ...etc...
    [Extent2].[CustomerName] AS [CustomerName], 
    -- ... etc...
    FROM  [dbo].[CustomerTable1] AS [Extent1]
    INNER JOIN [dbo].[CustomerTable2] AS [Extent2] ON [Extent1].[CustomerNumber] = [Extent2].[CustomerNumber]

The .Count() results in:

SELECT 
[GroupBy1].[A1] AS [C1]
FROM ( SELECT 
    COUNT(1) AS [A1]
    FROM [dbo].[customerTable2] AS [Extent1]
)  AS [GroupBy1]

Update - DataContext and entity code

The DataContext:

public class DataContext : DbContext
    {
        public DataContext() { Database.SetInitializer<DataContext>(null); }

        public DbSet<Customer> Customers { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            base.OnModelCreating(modelBuilder);

            modelBuilder.Configurations.Add(new CustomerMapping());
        }
    }
}

The Customer Mapping (FluentAPI):

public class CustomerMapping : EntityTypeConfiguration<Customer>
{
    public CustomerMapping()
    {
        this.Map( m => {
                    m.Properties( x => new { x.CustomerNumber, /*...etc...*/});
                    m.ToTable("CustomerTable1");
                })
        .Map( m => {
                    m.Properties( x => new { x.CustomerName, /*...etc...*/});
                    m.ToTable("CustomerTable2");
                });
    }
}

The Customer entity:

public class Customer
{
    [Key]
    public string CustomerNumber { get; set; }

    public string CustomerName { get; set; }
}
Community
  • 1
  • 1
quakkels
  • 11,676
  • 24
  • 92
  • 149
  • 1
    You haven't shown us the queries it generates... – Jeff Mercado Aug 01 '13 at 20:50
  • we need to see context.Customers to know more. If you perform a simply query then you will get the same count I think. – phillip Aug 01 '13 at 21:01
  • The question has been updated with additional context and entity code. – quakkels Aug 01 '13 at 21:20
  • Clearly there isn't a 1-1 relationship between CustomerTable1 records and CustomerTable2 records in the DB, or the counts would be the same... hmmm... maybe the relationship needs to be modeled differently to reflect that? Perhaps there is a latent assumption in EF that it is the case when working with entity-split entities. – J. Polfer Aug 02 '13 at 14:47
  • I only want records that exist in both places (inner join). Entity Framework is designed to use an Inner Join on these split entity configurations. But .Count() doesn't seem to honor that. – quakkels Aug 02 '13 at 21:09

2 Answers2

5

If the database and all records in CustomerTable1 and CustomerTable2 have been created by Entity Framework and SaveChanges calls in your application code this difference must not happen and you can go straight ahead and report this as a bug.

If you are mapping to an existing database or if other applications write records into the tables and you actually expect that not every record in CustomerTable1 has a corresponding record in CustomerTable2 and vice versa then Entity Splitting is the wrong mapping of your database schema.

Apparently the difference means that you can have Customers with a CustomerNumber (etc.), but without a CustomerName (etc.) - or the other way around. The better way to model this would be a one-to-one relationship where one side is required and the other side is optional. You will need an additional entity and a navigation property for this, for example like so:

[Table("CustomerTable1")]
public class Customer
{
    [Key]
    public string CustomerNumber { get; set; }
    // + other properties belonging to CustomerTable1

    public AdditionalCustomerData AdditionalCustomerData { get; set; }
}

[Table("CustomerTable2")]
public class AdditionalCustomerData
{
    [Key]
    public string CustomerNumber { get; set; }
    public string CustomerName { get; set; }
    // + other properties belonging to CustomerTable2
}

With this Fluent API mapping:

public class CustomerMapping : EntityTypeConfiguration<Customer>
{
    public CustomerMapping()
    {
        this.HasOptional(c => c.AdditionalCustomerData)
            .WithRequired()
            .WillCascadeOnDelete(true);
    }
}
Slauma
  • 175,098
  • 59
  • 401
  • 420
  • This is an existing database with an existing table structure. But my `Customer` class is a combination of two tables. Records can exist in `CustomerTable2` that don't exist in `CustomerTable1`. BUT an actual `Customer` entity exists only when those records overlap. This is all according to the existing rules of this database which I need to work within. Even with this reality, it strikes me that if `.Count()` uses a different query than `.ToList()` this should be considered a bug. – quakkels Aug 05 '13 at 13:33
  • 3
    @quakkels: I understand your argument and complaint. On the other hand using the `Count` query on only one of the tables is kind of a performance optimization, because the JOIN is not necessary *under the assumption* that the records in the 2 tables are in strict one-to-one relation which, I believe, is an assumption that EF makes when using Entity Splitting. Also, I think, the usage of an INNER JOIN is a decision based on the same assumption. It's purpose is not to filter out the customers that don't have data in *both* tables. I'd consider to report at CodePlex and see what the EF team says. – Slauma Aug 05 '13 at 17:13
1

I am querying a local table and I get the same count for both. I believe there is a problem with your context and that's why your results are inconsistent.

screenshot of essentially the same code just querying a simple dataset.

enter image description here

UPDATE:

I don't know why the SQL that is generated is different. You would think that they would be the same except for simply executing Count(*) instead of returning all the rows. That is obviously why you are getting a different counts. I just can't say why the SQL is different.

Maybe Jon Skeet or other genius will see this and answer! :)

phillip
  • 2,618
  • 19
  • 22
  • This is not a single table. This one entity `Customer` is mapped to TWO tables using Entity Splitting. It is discussed [here](http://stackoverflow.com/questions/6670580/mapping-multiple-tables-to-a-single-entity-class-in-entity-framework) and [here](http://msdn.microsoft.com/en-us/magazine/hh126815.aspx). – quakkels Aug 01 '13 at 21:11
  • I will add more code to my question. Perhaps that will help :-) – quakkels Aug 01 '13 at 21:14