For this example:
SELECT A.*
FROM Table1 A
WHERE A.Col1 IN (
SELECT B.Col1
FROM Table1 B
JOIN Table2 C ON B.Col1 = C.Col1
WHERE B.Col2 = 'Condition'
)
The example doesn't make any sense given A and B are aliases to the same table (Table1) and reading the entire thing looks like you've tried to build a simple example of a real-world query your are facing but have ended up writing something completely invalid.
The query you have could essentially be written as:
SELECT A.*
FROM Table1 A
INNER JOIN Table2 C ON A.Col1 = C.Col1;
Entity Framework is more than just a tool to substitute SQL with Linq Lambdas. Yes, with EF you can write Linq with Join
expressions between DbSets but 99.5% of the time you should never need to. Instead, you configure, or let EF work out the relationship between tables (provided the database follows a recognized naming and normalization convention) and EF takes care of the necessary joins behind the scenes.
Take for example you have a system with Customers & Orders. A Customer has many orders. Each of these tables would have an entity declared, but the relationship between these entities would also be mapped.
public class Customer
{
[Key]
public int CustomerId { get; set; }
// Other Customer details...
public virtual ICollection<Order> Orders { get; set; } = new List<Order>();
}
public class Order
{
[Key]
public int OrderId { get; set; }
public string OrderNumber { get; set; }
// Other Order details...
[ForeignKey(nameof(Customer))]
public int CustomerId { get; set; }
public virtual Customer Customer { get; set; }
}
In some examples you might find things like the [ForeignKey]
or relationship configuration is missing. EF can work out relationships automatically by convention, and unfortunately there are about 3 or 4 different ways that these relationships might be configured. (Annotations, vs. explicit configuration in EntityTypeConfiguration or during OnModelCreating, or via conventions) So unfortunately examples might be a bit confusing depending on where you look. I generally recommend always using explicit configuration to avoid surprises.
Now where in SQL if we wanted the order and customer details for a particular Order Number you would write something like:
SELECT *
FROM Orders o
INNER JOIN Customers c ON o.CustomerId = c.CustomerId
WHERE o.OrderNumber = '%OrderNumber%';
The Linq if we ignore navigation properties would be something like:
var details = context.Orders
.Where(o => o.OrderNumber == orderNumber)
.Join(context.Customers, o => o.CustomerId, c => c.CustomerId,
(o, c) => new {Order = o, Customer = c})
.Single();
Which is honestly pretty unsightly. However, with navigation properties you don't use Join
, you just access or eager load the associated navigation property. EF works out the SQL JOINs needed behind the scenes:
var order = context.Orders
.Include(o => o.Customer)
.Single(o => o.OrderNumber == orderNumber);
Here we load the single Order record and eager-load the navigation property for its customer. Now when you want customer details for the order, just access order.Customer. Alternatively if we want a particular Customer and it's Orders:
var customer = context.Customers
.Include(c => c.Orders)
.Single(c => c.CustomerId == customerId);
If you forget to add the Include
and your DbContext is set up to support Lazy Loading, then the navigation properties can still function so long as the DbContext is still accessible and not disposed. Accessing a navigation property that wasn't eager loaded can trigger a call back to the database to "lazy load" the related data. This should generally be avoided as it can lead to some pretty steep performance penalties, especially when dealing with collections of entities and their related data. If lazy loading is disabled then this can result in related data being left #null, or potentially incomplete/unreliable as EF will still populate whatever related entities it might be tracking at the time.
You can also be more selective so rather than loading everything about customers and/or Orders, you can summarize or be selective about the data being pulled using commands like Select
. (Called "Projection")
var customers = context.Customers
.Where(c => c.Orders.Any(o => o.OrderDate >= startDate))
.Select(c => new CustomerSummaryDto
{
CustomerId = c.CustomerId,
OrderCount = c.Orders.Count(o => o.OrderDate >= startDate),
RecentOrders = c.Orders
.Where(o => o.OrderDate >= startDate)
.OrderByDescending(o => o.OrderDate)
.Select(o => new OrderSummaryDto
{
OrderId = o.OrderId,
OrderDate = o.OrderDate,
Value = o.Value
}).Take(5).ToList()
}).ToList();
For instance to get a summary of customers that have orders after a given date, listing a few fields from customers, a count of orders from that date, and the summaries of (up to) 5 most recent orders. Notice that there are no Include
or Join
statements needed at all. EF is configured with how the entities and underlying tables are related and will generate the SQL needed to fetch the data we want based on the Linq expression we build. When reading data I highly recommend understanding and using projection to get data as this can greatly reduce the amount of data that needs to be transmitted and tracked by the DbContext, and avoids pitfalls of eager vs. lazy loading.