0

Could somebody explain how rewrite this SQL query to linq expression. What I'm trying to achieve: there is a jobs table which may have rows be active or not (is active bool column), also there is batch table which works as history for jobs runned. If user create new job in history table will be no records for it and I should get a null batch id but not null jobId in my app. Also it crucial i must get only max value from batch table for each job. i.e there is only one job with Id 1 in batch table there are PK with id 1 2 3 4. I need only one row with id 4.

select distinct on (b.type_id)
t.id as "JobId", b.id , t.interval_minutes 
from types t 
left join batches b on t.id = b.type_id 
where t.is_active 
order by t.id , b.id desc

I can't figure out how to write distinct on part. There is a issue on github with example but I don't understand how to integrate it to my code. My code:

(from types in Types
join batch in Batches on types.Id equals batch.TypeId into joinBatch
from jBatch in joinBatch.DefaultIfEmpty()           
where types.IsActive
orderby types.Id, jBatch.Id descending    
select new DTO() {
    JobId = types.Id,
    ExecTime = types.IntervalMinutes,
    BatchId = jBatch.Id,
})
//.GroupBy(b => b.BatchId).Select(g => g.First()) //getting here an exception if batchid is null
.Dump();

public class DTO{
    public int JobId {get; set;}
    public int ExecTime {get; set;}
    public long? BatchId {get; set;}
}

This code generates:

SELECT p.id AS "JobId", p.interval_minutes AS "ExecTime", b.id AS "BatchId"
FROM types AS p
LEFT JOIN batches AS b ON p.id = b.type_id
WHERE p.is_active
ORDER BY p.id, b.id DESC
GO
Ram
  • 131
  • 8
TryHard
  • 125
  • 9
  • 1
    Could you please provide a job and batch table record sample so it will be easy to understand @TryHard – Govind Dec 13 '22 at 06:13
  • 1
    1) please post exception you get for your attempt 2) `DISTINCT ON` is not supported AFAIK (the groupby would be translated into group by as I understand). – Guru Stron Dec 13 '22 at 06:25
  • 1
    Are you using EF Core? Some other ORM? LINQ isn't SQL to begin with and doesn't query databases directly. ORMs like EF Core translate LINQ queries to SQL based on their entities and relations. `Batch` should have a `Type` property and `Type` should have `Batches`. No joins are needed in this case. The equivalent of `DISTINCT ON` is `DistinctBy` but it's up to the NpgSQL EF Core provider to translate `DistinctBy` to `DISTINCT ON` – Panagiotis Kanavos Dec 13 '22 at 08:59
  • 1
    [This NpgSQL issue](https://github.com/npgsql/efcore.pg/issues/894) shows that DistinctBy isn't working yet but ` blogs.GroupBy(b => b.Id).Select(g => g.First());` should work – Panagiotis Kanavos Dec 13 '22 at 09:01

2 Answers2

1
If I understand your query correctly 

enter image description here

then You can do this.

   public class Types
    {
        public int Id { get; set; }
        public bool IsActive { get; set; }
        public int IntervalMinutes { get; set; }
    }
    public class Batches
    {
        public int Id { get; set; }
        public int TypeId { get; set; }

    }
    public class DTO
    {
        public int JobId { get; set; }
        public int ExecTime { get; set; }
        public long? BatchId { get; set; }
    }
        private void GetDistinctList()
        {
            List<Types> TypesLst = new List<Types>()
            {
                new Types() { Id = 1, IntervalMinutes = 10,IsActive=true },
                new Types() { Id = 2, IntervalMinutes = 15 ,IsActive=true}
            };
            List<Batches> BatchesLst = new List<Batches>()
            {
                new Batches() { Id = 1, TypeId=1 },
                new Batches() { Id = 2, TypeId=1 },
                new Batches() { Id = 3, TypeId=1 },
                new Batches() { Id = 4, TypeId=1 }
            };


            var dtoList = (from types in TypesLst
                 join batch in BatchesLst on types.Id equals batch.TypeId into joinBatch
                 from jBatch in joinBatch.DefaultIfEmpty()
                 where types.IsActive
                 select new DTO()
                 {
                     JobId = types.Id,
                     ExecTime = types.IntervalMinutes,
                     BatchId = jBatch == null ? null : jBatch.Id
                 })
                 .OrderBy(d => d.JobId)
                 .ThenByDescending(z => z.BatchId)
                .GroupBy(b => b.JobId).Select(g => g.FirstOrDefault());

enter image description here

Govind
  • 186
  • 5
  • Can't figue it out but getting in linqpad an exception Queries performing 'LastOrDefault' operation must have a deterministic sort order. Rewrite the query to apply an 'OrderBy' operation on the sequence before calling 'LastOrDefault'. – TryHard Dec 13 '22 at 08:22
  • I used order by and removed LastOrDefault from the code. I am not getting any exceptions. Could you try this? find on the above @TryHard – Govind Dec 13 '22 at 08:51
  • I think there is diference between interpritation of linq to sql and linq to object. Your code works fine for in memory collection. But when i'm trying to use it with EF core, part with order by ignored. – TryHard Dec 13 '22 at 09:37
  • I found a problem. Your row .GroupBy(b => b.JobId).Select(g => g.FirstOrDefault()); should looks like .GroupBy(b => b.JobId).Select(g => g.OrderByDescending(x => x.BatchId).FirstOrDefault() and then EF works as it should. – TryHard Dec 13 '22 at 09:55
  • You can modify the above code as per your requirement. Great @TryHard.. Thanks for your update. – Govind Dec 13 '22 at 09:59
0

you can use DistinctBy() method where it works to eliminate duplicate records based on parameter.

for example if you want to remove duplicate records without params:

(from types in Types
join batch in Batches on types.Id equals batch.TypeId into joinBatch
from jBatch in joinBatch.DefaultIfEmpty()           
where types.IsActive
orderby types.Id, jBatch.Id descending    
select new DTO() {
    JobId = types.Id,
    ExecTime = types.IntervalMinutes,
    BatchId = jBatch.Id,
}).Distinct()

if you want to remove duplicates with specific primitive param

(from types in Types
join batch in Batches on types.Id equals batch.TypeId into joinBatch
from jBatch in joinBatch.DefaultIfEmpty()           
where types.IsActive
orderby types.Id, jBatch.Id descending    
select new DTO() {
    JobId = types.Id,
    ExecTime = types.IntervalMinutes,
    BatchId = jBatch.Id,
}).DistinctBy(p => p.JobId)

if you want remove duplicates by passing complex params

(from types in Types
    join batch in Batches on types.Id equals batch.TypeId into joinBatch
    from jBatch in joinBatch.DefaultIfEmpty()           
    where types.IsActive
    orderby types.Id, jBatch.Id descending    
    select new DTO() {
        JobId = types.Id,
        ExecTime = types.IntervalMinutes,
        BatchId = jBatch.Id,
    }).DistinctBy(p => new {p.JobId, p.BatchId})

for more go through LINQ's Distinct() on a particular property

Sangeetha
  • 24
  • 4
  • I'm getting an error when adding DistinctBy: DistinctBy(p => p.JobId)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See – TryHard Dec 13 '22 at 08:53
  • Obviously OP is using LINQ for ORM with high chance that it is EF Core. AFAIK ATM EF Core does not support `DistinctBy`. – Guru Stron Dec 13 '22 at 10:10