2

I have the following tables mapped with Entity Framework:

Invoices (InvoiceID, InvoiceNumber, IsPaid)
Parts (PartID, PartNumber, OrganisationID, IsActive)
Invoices_Parts (InvoiceID, PartID, Quantity, IsClaimed)

I'm trying to get a List of InvoiceIDs from the Invoices_Parts table with a number of conditions. The condition I'm having a problem with is "where any Part from Invoices_Parts is active".

The code so far:

IQueryable<int> partIds = db.Parts.Where(x => 
    x.OrganisationID == loggedInUser.OrganisationID).Select(y => y.PartID);
    // && x.IsActive

List<string> invoiceIds = db.Invoices_Parts.Where(x => 
    !x.IsClaimed && x.Invoice.IsPaid && partIds.Contains(x.PartID)
    .DistinctBy(y => y.InvoiceID)
    .Select(z => z.InvoiceID.ToString()).ToList();

I've commented out "&& x.IsActive" because I want the list of InvoiceIDs to be such that at least one part must meet the IsActive condition - I don't want to filter out all the parts where IsActive is false. How do I achieve this in LINQ without manually looping through collections and adding/removing items?

Note: In case anyone is wondering, I'm using the following helper method for DistinctBy:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

Edit: I have the following properties on each entity:

Invoice:

public virtual ICollection<Invoices_Parts> Invoices_Parts { get; set; }

Part:

public virtual ICollection<Invoices_Parts> Invoices_Parts { get; set; }

Invoices_Parts:

public virtual Invoice Invoice { get; set; }
public virtual Part Part { get; set; }
Servy
  • 202,030
  • 26
  • 332
  • 449
notAnonymousAnymore
  • 2,637
  • 9
  • 49
  • 74
  • Do you have some navigation properties in your entities? – Sergey Berezovskiy Jan 30 '14 at 18:07
  • 3
    You might be aware of this but I should point out that your IQueryable will fire when it gets to the `foreach` in your `DistinctBy` extension. I would say that for clarity's sake you should put a `ToList()` before the `DistinctBy()` call in the chain to make clear that is where the query evaluates. – George Mauer Jan 30 '14 at 18:07
  • 2
    @GeorgeMauer Adding a `ToList` needlessly evaluates the entire query immediately, forces the entire result set to be evaluated, and has the overhead of creating/populating a list that is going to be completely disregarded. `AsEnumerable` accomplishes the desired readability semantics without any additional overhead at all. – Servy Jan 30 '14 at 18:09
  • I am really not sure what the OP is asking. Can we maybe get some example data? Or perhaps what your unexpected results are? As @AnnL mentions further down it kind of looks like you had what you wanted already. – George Mauer Jan 30 '14 at 18:12
  • 1
    Actually, it looks to me as if, were you to put the `IsActive` test back, you'd have exactly what you want. You're looking for the `InvoiceID` values, so pulling the `InvoiceID` off all the `Invoice_Parts` records where `IsActive` is true and the other things are true should get you exactly that. – Ann L. Jan 30 '14 at 18:12
  • @Servy that's true but unless I'm mistaken `AsEnumerable` won't fire the query and I recommend that he actually fire the query at that point but yes it would have a similar semantic meaning. – George Mauer Jan 30 '14 at 18:14
  • I am not clear on what `DistinctBy` is supposed to do. Can you explain? – Ann L. Jan 30 '14 at 18:18
  • @GeorgeMauer And I'm saying you shouldn't. There's no reason to evaluate the query at that point. It's preferable not to. Let it be evaluated when it's needed; that's the whole idea of LINQ. By not forcing it into a list you also allow the results to be streamed; rather than requiring every single item before continuing, thus reducing throughput. – Servy Jan 30 '14 at 18:18
  • @AnnL. Not that it matters for the sake of the question, but it performs a `Distinct` operation on a projected value of the object. – Servy Jan 30 '14 at 18:18
  • 1
    Well yes, except that the `DistinctBy` extension will *already* force the query to fire. What I'm specifically objecting to is that it looks like it won't (because `Distinct()` is implemented in the LINQ provider), but it actually does (because `DistinctBy` is not). So I guess maybe a better suggestion would be to rename it to `EvaluateAndDistinctBy` – George Mauer Jan 30 '14 at 18:25

3 Answers3

5

People often overcomplicate LINQ queries because they're so used to thinking in terms of SQL tables, joins, and such. Usually if you put your actual requirement in plain english, you can come up with a simpler query that reads almost exactly the same.

I want a list of invoice IDs from the paid invoices where any part on that invoice is active, and ... etc.

How about this:

from invoice in db.Invoices
where invoice.IsPaid
where invoice.Invoices_Parts.Any(ip => !ip.IsClaimed && 
    ip.Part.IsActive && 
    ip.OrganisationID == loggedInUser.OrganisationID)
select invoice.InvoiceId

Or if you prefer method syntax:

db.Invoices.Where(i => i.IsPaid)
.Where(i => i.Invoices_Parts.Any(ip => !ip.IsClaimed && 
    ip.Part.IsActive && 
    ip.OrganisationID == loggedInUser.OrganisationID)
.Select(i => i.InvoiceId)
.ToList();

PS--you can do a .ToString() if you want to, but in my experience it's wiser to leave IDs strongly typed.

PPS--you can make a DistinctBy method that will play nicely with Entity Framework (not forcing premature evaluation) like so:

public static IQueryable<TSource> DistinctBy<TSource, TKey>(this IQueryable<TSource> source, Excpression<Func<TSource, TKey>> keySelector) {
    return source.GroupBy(keySelector).Select(i => i.FirstOrDefault());
}
StriplingWarrior
  • 151,543
  • 27
  • 246
  • 315
1

OK, assuming that you want all the invoices where at least one part ON THAT INVOICE is active, I'd do it like this:

IQueryable<int> partIds = db.Parts
  .Where(x => x.OrganisationID == loggedInUser.OrganisationID && x.IsActive)
  .Select(y => y.PartID);


List<string> invoiceIds = db.Invoices_Parts
   .Where(x =>  !x.IsClaimed && x.Invoice.IsPaid && partIds.Contains(x.PartID))
   .Select(y => y.InvoiceID.ToString())
   .Distinct()
   .ToList();

If that looks a lot like what you have, it's because it is. As I understood the question, you were already 99.99% there.

Ann L.
  • 13,760
  • 5
  • 35
  • 66
  • I probably didn't explain myself properly in the question. I also used arbitrary field names because the actual data sets had lots more fields. – notAnonymousAnymore Jan 31 '14 at 14:43
0

What you can do here is join the invoices with the parts to create a collection of all of the parts for each invoice. Once you have that collection determining if any item in it is active is easy enough:

List<string> invoiceIds = 
    (from invoice in db.Invoices_Parts
    where !invoice.IsClaimed && invoice.Invoice.IsPaid
    join part in partIds
    on invoice.PartId equals part into parts
    where parts.Any(part => part.IsActive)
    select invoice.InvoiceID)
    .Distinct()
    .ToList();
Servy
  • 202,030
  • 26
  • 332
  • 449