3

If I use the following code, I get a list of students that study course 1 and course 2. (This is almost what I want.)

IQueryable<Student> filteredStudents = context.Students;
filteredStudents = filteredStudents
    .Where(s => s.Courses.Select(c => c.CourseID).Contains(1));
filteredStudents = filteredStudents
    .Where(s => s.Courses.Select(c => c.CourseID).Contains(2));
List<Student> studentList = filteredStudents.ToList<Student>();  

However, if I try and do this in a foreach loop (as shown in the following code) then I get a list of all students that are signed up for the last course in the loop.

IQueryable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
    if (course != null) {             
        filteredStudents = filteredStudents
            .Where(s => s.Courses.Select(c => c.CourseID).Contains(course.CourseID));
    }
}
List<Student> studentList = filteredStudents.ToList<Student>();

This behaviour confuses me. Can anyone explain why this is happening? And how to get around it? Thank you.

christiaantober
  • 251
  • 3
  • 10
  • 1
    the `.Select(s => s)` is unneeded by the way, `Student` is already selected by default – jjj May 22 '15 at 19:18

4 Answers4

4

The problem is that the foreach loop only creates a single course variable for all of the loop iterations, and this single variable is then captured into a closure. Also remember that the filters aren't actually executed until after the loop. Put those together, and by the time the filters execute this single course variable has advanced to the last item in the Courses filter; you only check against that one last course.

I see four ways to fix the problem.

First

Create a new variable for each iteration of the loop (this is probably your best quick fix)

IQueryable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
    if (course != null) {  
        int CourseID = course.CourseID;            
        filteredStudents = filteredStudents
            .Where(s => s.Courses.Select(c => c.CourseID).Contains(CourseID));
    }
}
List<Student> studentList = filteredStudents.ToList<Student>();

Second

Resolve the IEnumerable expression inside the loop (probably much less efficient):

IEnumerable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
    if (course != null) {             
        filteredStudents = filteredStudents
            .Where(s => s.Courses.Select(c => c.CourseID).Contains(course.CourseID))
            .ToList(); 
    }
}
List<Student> studentList = filteredStudents.ToList<Student>();

Third

Use more appropriate linq operators/lambda expressions to eliminate the foreach loop:

var studentList = context.Students.Where(s => s.Courses.Select(c => c.CourseID).Intersect(filter.Courses.Select(c => c.CourseID)).Any()).ToList();

Or in a more readable way:

IQueryable<Student> filteredStudents = context.Students;
var courses = filter.Courses.Select(c => c.CourseID);
var studentList = filteredStudents
       .Where(s => s.Courses.Select(c => c.CourseID)
                       .Intersect(courses)
                       .Any()
       ).ToList();

If you play with this a bit, the performance should meet or far exceed the foreach loop through clever internal use of HashSets or — if you're very lucky — by sending a JOIN query to the DB). Just be careful, because it's easy to write something here that make numerous "extra" calls out to the DB inside that Intersect() or Any() method. Even so, this is the option I tend to prefer, with the exception that I probably wouldn't bother to call .ToList() at the end.

This also illustrates why I don't have much use for ORMs like Entity Framework, linq-to-sql, or even NHibernate or ActiveRecord. If I'm just writing SQL, I can know I'm getting the correct join query. I could do that with an ORM, too, but now I still need to know about the specific SQL I'm creating, and I also have to know how to get the ORM to do what I want.

Fourth

Use C# 5.0. This is fixed in the most recent version of C#, so that each iteration of a for/foreach loop is its own variable.

Ry-
  • 218,210
  • 55
  • 464
  • 476
Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • I just tried your third method, but it doesn't give the desired result. It returns Students that have "any" of the courses in the filter. I would like it to return Students that have "all" of the courses in the filter. – christiaantober Jun 06 '15 at 22:13
1

If you're trying to get every Student that is enrolled in every course in filter.Courses, you could try:

var courseIDs = filter.Courses.Select(c => c.CourseID);
var filteredStudents = context.Students
    .Where(s => !courseIDs.Except(s.Courses.Select(c => c.CourseId)).Any())

which filters on whether courseIDs is a subset of a Student's course IDs.

Edit

Joel Coehoorn and Mikael Eliasson give a good explanation for why all of the students in the last course were retrieved.

Community
  • 1
  • 1
jjj
  • 4,822
  • 1
  • 16
  • 39
0

Because "filteredStudents = filteredStudents.Where..." is a straight assignment to the variable, each time through the loop you are completely replacing what was in it before. you need to append to it, not replace. Try a search for "c# AddRange"

Christopher
  • 134
  • 7
0

I don't think this has to do with Entity Framework. It's a bug (not really, but a stupid design in c#) where the variable is declared outside the loop.

In this case it means that because the IEnumerable is lazily evaluated it will use the LAST value of the variable. Use a temp variable in the loop to solve it.

foreach (Course course in filter.Courses) {
    if (course != null) {
        var cId = course.CourseID;       
        filteredStudents = filteredStudents
            .Where(s => s.Courses.Select(c => c.CourseID).Contains(cId))
                .Select(s => s);
    }
}

Or even better if you have the navigation properties properly defined. Just do:

var studentList = filter.Courses.SelectMany(c => c.Students).ToList()

See more info here: Is there a reason for C#'s reuse of the variable in a foreach?

Community
  • 1
  • 1
Mikael Eliasson
  • 5,157
  • 23
  • 27