I have this query in SQL:
SELECT *
FROM TableName
WHERE myData IN (SELECT MAX(myData) AS DATA_MAX
FROM TableName
GROUP BY id1, id2)
I want to replicate it in LINQ (C#). How can I do that?
This isn't really a direct answer because it doesn't implement it via LINQ; but it does solve the problem, with the minimum amount of fuss:
You can use tools like "Dapper" to execute raw queries without involving any LINQ. If you're using something like LINQ-to-SQL or Entity Framework, the data-context there also usually has a raw query API that you can use, but I'm going to show a "Dapper" implementation:
class SomeType
{
// not shown: properties that look like the columns
// of [TableName] in the database - correct names/types
}
...
var data = connection.Query<SomeType>(@"
SELECT * FROM TableName
WHERE myData IN (Select max(myData) as DATA_MAX from TableName group
by id1, id2)").AsList();
This approach makes it very easy to migrate existing SQL queries without having to rewrite everything as LINQ.
If you are using LINQ-to-SQL, DataContext has a similar ExecuteQuery<TResult> method. Entity Framework has a SqlQuery method.
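You can try this; it may work:
// Maximum myData for each (id1, id2) group. This stays a deferred query.
// Project the scalar directly: "select new { gcs.Max(...) }" does not compile,
// because an anonymous type member needs a name.
var myData = from c in _context.TableName
             group c by new { c.id1, c.id2 } into gcs
             select gcs.Max(p => p.myData);
// Rows whose myData matches one of the group maxima (the SQL IN clause).
var result = (from t in _context.TableName
              where myData.Contains(t.myData)
              select t).ToList();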
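To check the logic without a database, here is a minimal LINQ to Objects sketch of the same two-step idea. The `Row` record, the `int` column types, and the `MaxPerGroupDemo` class are assumptions made up for illustration; a real provider such as Entity Framework translates the query to SQL and may behave differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for TableName; the column types are assumptions.
public record Row(int id1, int id2, int myData);

public static class MaxPerGroupDemo
{
    // Mirrors: WHERE myData IN (SELECT MAX(myData) ... GROUP BY id1, id2)
    public static List<Row> RowsMatchingAnyGroupMax(IEnumerable<Row> table)
    {
        var rows = table.ToList();

        // One maximum per (id1, id2) group, collected into a set for fast lookups.
        var maxima = rows
            .GroupBy(r => new { r.id1, r.id2 })
            .Select(g => g.Max(r => r.myData))
            .ToHashSet();

        // Keep every row whose myData equals one of those maxima.
        return rows.Where(r => maxima.Contains(r.myData)).ToList();
    }

    public static void Main()
    {
        var table = new[]
        {
            new Row(1, 1, 10),
            new Row(1, 1, 20),
            new Row(1, 2, 5),
            new Row(2, 1, 7),
        };

        foreach (var r in RowsMatchingAnyGroupMax(table))
            Console.WriteLine(r);
    }
}
```

With this sample data the group maxima are 20, 5 and 7, so every row except (1, 1, 10) is returned.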
Long story short: don't use LINQ. Optimize the query and use a microORM like Dapper to map the results to classes:
var query = "Select * " +
"from ( select *, " +
" ROW_NUMBER() OVER (partition by id1,id2 order by mydata desc) AS RN " +
" From TableName ) T " +
"where RN=1";
var data = connection.Query<SomeType>(query);
LINQ isn't a replacement for SQL. ORMs in general aren't meant to write reporting queries like this one.
Reporting queries need a lot of optimization and usually have to change in production. You don't want to have to redeploy your application each time a query changes. In this case it's far better to create a view and map to it using a microORM like Dapper.
This specific query could require two table scans, one to calculate the maximum per id1,id2 and one to find the rows with matching mydata. The intermediate data would have to be spooled into tempdb too. If mydata is covered by an index, it may not be such an expensive query. If it isn't, all the data will be scanned twice.
An alternative is to calculate the ranking of each row by mydata based on id1, id2. You can do this with one of the ranking functions like ROW_NUMBER, RANK, NTILE.
Select *
from ( select *,
ROW_NUMBER() OVER (partition by id1,id2 order by mydata desc) AS RN
From TableName) T
where RN=1
You can use that query directly with Dapper or create a view and map your entities to the view, not the table itself.
One option would be to create a MyTableRanked view:
CREATE VIEW MyTableRanked AS
select *,
ROW_NUMBER() OVER (partition by id1,id2 order by mydata desc) AS RN
From TableName
This would allow you to write:
var query="Select * from MyTableRanked where RN<=@rank";
var data = connection.Query<SomeType>(query,new {rank=2});
This returns the top N records per id1, id2 combination (here N=2). Filtering with RN=@rank instead would return only the Nth-ranked record of each group.
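To make it concrete what ROW_NUMBER computes, here is a rough in-memory analogue in LINQ to Objects. It is not a suggestion to replace the SQL. The `Row` record, its `int` column types, and the `TopPerGroup` helper are hypothetical names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for TableName; the column types are assumptions.
public record Row(int id1, int id2, int myData);

public static class RankPerGroupDemo
{
    // In-memory analogue of
    //   ROW_NUMBER() OVER (PARTITION BY id1, id2 ORDER BY mydata DESC) AS RN
    // followed by WHERE RN <= topN.
    public static List<Row> TopPerGroup(IEnumerable<Row> table, int topN = 1)
    {
        return table
            .GroupBy(r => new { r.id1, r.id2 })             // PARTITION BY id1, id2
            .SelectMany(g => g
                .OrderByDescending(r => r.myData)           // ORDER BY mydata DESC
                .Select((row, index) => new { row, RN = index + 1 }))
            .Where(x => x.RN <= topN)                       // WHERE RN <= topN
            .Select(x => x.row)
            .ToList();
    }

    public static void Main()
    {
        var table = new[]
        {
            new Row(1, 1, 10),
            new Row(1, 1, 20),
            new Row(1, 2, 20),
            new Row(1, 2, 30),
        };

        foreach (var r in TopPerGroup(table))
            Console.WriteLine(r);
    }
}
```

With the sample data, `TopPerGroup(table)` keeps (1, 1, 20) and (1, 2, 30). The original IN query would also return (1, 2, 20), because 20 is the maximum of a different group. So the ranking version is not a strict drop-in replacement: it returns exactly one row per group when topN is 1, and ties are broken arbitrarily.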