Writing hierachial queries in Linq to SQL is always a mess, it can work in memory but it doesn't translate to efficient SQL queries, this is a good discussion on on SO about some hierarchial Linq techniques
There are a few options:
- Don't use Linq at all and query from your CTE directly!
- Convert your CTE to a View
- Re-write the query so that you don't need the CTE
- This is easier if you have a fixed or theoretical limit to the recursion.
- Even if you don't want to limit it, if you review the data and find that the highest level of recursion is only 2 or 3, then you could support
How to use a CTE directly in EF 6
DbContext.Database.SqlQuery<TElement>(string sql, params object[] parameters)
Creates a raw SQL query that will return elements of the given type (TElement
). The type can be any type that has properties that match the names of the columns returned from the query
NOTE: Do NOT use select *
for this type (or any) of query, explicitly define the fields that you expect in the output to avoid issues where your query has more columns available than the EF runtime is expecting.
- Perhaps of equal importance, if you want or need to apply filtering to this record set, you should implement the filtering in the raw SQL string value. The entire query must be materialized into memory before EF Linq filtering expressions can be applied
.SqlQuery
does support passing through parameters, which comes in handy for filter expressions ;)
string cteQuery = @"
with cte (Id, Name, Address, Email, PhoneNumber, Date, IdParent, sort) as
(
select Id, Name, Address, Email, PhoneNumber, Date, IdParent,
cast(right('0000' + cast(row_number() over (order by Id) as varchar(5)), 5) as varchar(1024))
from Something
where Id = IdParent
union all
select t.Id, t.Name, t.Address, t.Email, t.PhoneNumber, t.Date, t.IdParent,
cast(c.sort + right('0000' + cast(row_number() over (order by t.Id) as varchar(5)), 5) as varchar(1024))
from cte c
inner join Something t on c.Id = t.IdParent
where t.Id <> t.IdParent
)
select Id, Name, Address, Email, PhoneNumber, Date, IdParent
from cte
order by sort
";
using (var ctx = new MyDBEntities())
{
var list = ctx.Database
.SqlQuery<Something>(cteQuery)
.ToList();
}
- Understanding how and when to use
.SqlQuery
for executing raw SQL comes in handy when you want to squeeze the most performance out of SQL without writing complex Linq statements.
- This comes in handy if you move your CTE into a view or table valued function or a stored procedure, once the results have been materialized into the list in memory, you can treat these records like any other
Convert your CTE to a View
If you are generating your EF model from the database, then you could create a view from your CTE to generate the Something
class, however this becomes a bit disconnected if you also want to perform CRUD operations against the same table, having two classes in the model that represent virtually the same structure is a bit redundant IMO, perfectly valid if you want to work that way though.
- Views cannot have
ORDER BY
statements, so you take this statement out of your view definition, but you still include the sort
column in the output so that you can sort the results in memory.
- Converting your CTE to a view will have the same structure as your current
Something
class, however it will have an additional column called sort
.
How to write the same query without CTE
As I alluded at the start, you can follow this post Hierarchical queries in LINQ to help process the data after bringing the entire list into memory. However in my answer to OPs orginal post, I highlighted how simple self joins on the table can be used to produce the same results, we can easily replicate the self join in EF.
Even when you want to support a theoretically infinitely recursive hierarchy the realty of many datasets is that there is an observable or practical limit to the number of levels. If you can identify that practical limit, and it is a small enough number, then it might be simpler from a C# / Linq perspective to mot bother with the CTE at all
- Put it the other way around, ask yourself this question: "If I set a practical limit of
X
number of levels of recursion, how will that affect my users?"
- Put 4 in for
X
, if the result is that users will not generally be affected, or this data scenario is not likely to occur then lets try it out.
If a limit of 4 is acceptable, then this is your Linq statement:
I've used fluent notation here to demonstrate the relationship to SQL
var list = from child in ctx.Somethings
join parent in ctx.Somethings on child.parentId equals parent.Id
join grandParent in ctx.Somethings on parent.parentId equals grandParent.Id
orderby grandParent.parentId, parent.parentId, child.parentId, child.Id
select child;
I would probably use short hand aliases for this query in production, but the naming convention makes the intended query quickly human relatable.
If you setup a foreign key in the database linking parentId to the Id of the same table, then the Linq side is much simpler
- This should generate a navigation property to enable traversing the foreign key through linq, in the following example this property is called
Parent
var list = ctx.Somethings
.OrderBy(x => x.Parent.Parent.ParentId)
.ThenBy(x => x.Parent.ParentId)
.ThenBy(x => x.ParentId)
.ThenBy(x => x.Id);
You can see in this way, if we can limit the recusion level, the Linq required for sorting based on the recursive parent is quite simple, and syntactically easy to extrapolate to the number of levels you need.
- You could do this for any number of levels, but there is a point where performance might become an issue, or where the number of line of code to achieve this is more than using the
SqlQuery
option presented first.