1

I have two lists of different types. I need to keep only the elements of list1 that have a matching element in list2, where that list2 element satisfies certain criteria.

Here is what I tried. It seems to work, but each element is listed twice.

   var filteredTracks =
                    from mtrack in mTracks 
                    join ftrack in tracksFileStatus on mtrack.id equals ftrack.Id
                    where mtrack.id == ftrack.Id && ftrack.status == "ONDISK" && ftrack.content_type_id == 234
                    select mtrack;

Ideally I don't want to create a new copy as filteredTracks. Is it possible to modify mTracks in place?

srs

2 Answers

4

If you're getting duplicates, it's because your id fields are not unique in one or both of the two sequences. Also, you don't need to say where mtrack.id == ftrack.Id since that condition already has to be met for the join to succeed.

I would probably use loops here, but if you are dead set on LINQ, you may need to group tracksFileStatus by its Id field. It's hard to tell by what you posted.
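
A minimal sketch of one way to avoid the duplicates, assuming the repeated Ids are on the tracksFileStatus side. The `MTrack` and `FileStatus` classes and the `TrackFilter.FilterByStatus` helper are hypothetical stand-ins, since the question does not show the real types. Instead of grouping, it collapses the status rows to a distinct set of qualifying Ids first, so each element of mTracks can appear at most once per occurrence in mTracks:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two element types in the question.
public class MTrack
{
    public int id { get; set; }
}

public class FileStatus
{
    public int Id { get; set; }
    public string status { get; set; } = "";
    public int content_type_id { get; set; }
}

public static class TrackFilter
{
    public static List<MTrack> FilterByStatus(
        IEnumerable<MTrack> mTracks,
        IEnumerable<FileStatus> tracksFileStatus)
    {
        // Collapse the status rows to the distinct set of qualifying Ids,
        // so repeated Ids on the status side cannot multiply the results.
        var onDiskIds = new HashSet<int>(
            from ftrack in tracksFileStatus
            where ftrack.status == "ONDISK" && ftrack.content_type_id == 234
            select ftrack.Id);

        // Each mtrack is tested once, so it is emitted at most once.
        return mTracks.Where(mtrack => onDiskIds.Contains(mtrack.id)).ToList();
    }
}
```

If mTracks itself contains repeated ids, those repeats would still come through, so it is worth checking which side the duplicates are on first.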

As far as "modifying mTracks in place", this is probably not possible or worthwhile (I'm assuming that mTracks is some type derived from IEnumerable<T>). If you're worried about the efficiency of this approach, then you may want to consider using another kind of data structure, like a dictionary with Id values as the keys.
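
That said, if mTracks happens to be a `List<T>` (an assumption, the question does not say), `List<T>.RemoveAll` can drop the unwanted elements without building a second list. The sketch below uses hypothetical `MTrack` and `FileStatus` classes and a hypothetical `InPlaceFilter.KeepOnDisk` helper; a `HashSet<int>` stands in for the dictionary mentioned above, since only the keys are needed here:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two element types in the question.
public class MTrack
{
    public int id { get; set; }
}

public class FileStatus
{
    public int Id { get; set; }
    public string status { get; set; } = "";
    public int content_type_id { get; set; }
}

public static class InPlaceFilter
{
    // Removes from mTracks every element without a qualifying status row.
    // Returns the number of elements removed.
    public static int KeepOnDisk(
        List<MTrack> mTracks,
        IEnumerable<FileStatus> tracksFileStatus)
    {
        // Hash lookup of the Ids worth keeping.
        var keepIds = new HashSet<int>(
            tracksFileStatus
                .Where(f => f.status == "ONDISK" && f.content_type_id == 234)
                .Select(f => f.Id));

        // RemoveAll mutates the list in place.
        return mTracks.RemoveAll(m => !keepIds.Contains(m.id));
    }
}
```

This only applies to a concrete `List<T>`; a plain `IEnumerable<T>` has no way to remove elements.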

We Are All Monica
  • +1 for explaining source of duplicates. A `Distinct` call could also eliminate the duplicates, but I prefer to know where the duplicates are coming from before considering its use. – devgeezer Apr 03 '12 at 18:34
  • devgeezer: agreed, you should understand your data. Also, depending on the objects' `IEquatable<T>` implementation (if any), it may be necessary to write a custom class implementing `IEqualityComparer<T>` to use `Distinct()`. – We Are All Monica Apr 03 '12 at 18:37
  • Thanks for the replies! Is it possible to make list1 the driver and search in list2? Loops are easy to do as well, but I am just starting with LINQ and am rather impressed with how expressively it handles complex queries. – srs Apr 03 '12 at 19:50
  • I agree, LINQ is good stuff. To reverse the order, you should just be able to say `from ftrack in tracksFileStatus join mtrack in mTracks ...`. At the end, `select mtrack` should still work fine since you will still have that variable defined and available. – We Are All Monica Apr 03 '12 at 20:35
3

Since the question was primarily about lists, this is probably better LINQ-wise:

var test = (from m in mTracks
            from f in fTracks
            where m.Id == f.Id && ...
            select m);

However, you should optimize. For example:
Are your lists sorted? If they are, see e.g. Best algorithm for synchronizing two IList in C# 2.0.
If the data comes from a database (it's not clear here), then you need to build your LINQ query around the SQL relations and indexes you have in the database, which is a somewhat different route.
If I were you, I'd query each of the lists (presuming they're not database-bound) so that the tracks are sorted in the first place, usually on whatever key is used to compare them,
then enumerate both in parallel (using enumerators), comparing other things in the process (as in that link).
That's likely the most efficient way.
If and when the data comes from a database, optimize at the source: fetch data that is already sorted and filtered as much as you can. Basically, write the SQL first, or inspect the SQL generated by the LINQ query (let me know if you need the link).
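
A rough sketch of the parallel enumeration, reduced to plain `int` ids to keep it short. The `SortedMatch.IntersectSorted` name is made up for illustration, and both inputs are assumed to be sorted ascending already; the right-hand sequence would be the ids of the status rows that passed the filter:

```csharp
using System.Collections.Generic;

public static class SortedMatch
{
    // Yields each id from `left` that also occurs in `right`.
    // Both sequences must already be sorted in ascending order.
    public static IEnumerable<int> IntersectSorted(
        IEnumerable<int> left,
        IEnumerable<int> right)
    {
        using (var l = left.GetEnumerator())
        using (var r = right.GetEnumerator())
        {
            bool hasL = l.MoveNext();
            bool hasR = r.MoveNext();

            while (hasL && hasR)
            {
                int cmp = l.Current.CompareTo(r.Current);
                if (cmp < 0)
                {
                    hasL = l.MoveNext();   // left id has no match, skip it
                }
                else if (cmp > 0)
                {
                    hasR = r.MoveNext();   // right id is behind, catch up
                }
                else
                {
                    yield return l.Current;
                    // Advance only the left side, so a repeated left id
                    // still matches and a repeated right id is ignored.
                    hasL = l.MoveNext();
                }
            }
        }
    }
}
```

Each sequence is walked once, so the cost is linear in the combined length, on top of whatever the sorting costs.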

NSGaga-mostly-inactive