The problem is that the foreach loop only creates a single course
variable for all of the loop iterations, and this single variable is then captured into a closure. Also remember that the filters aren't actually executed until after the loop. Put those together, and by the time the filters execute this single course
variable has advanced to the last item in the Courses filter; you only check against that one last course.
I see four ways to fix the problem.
First
Create a new variable for each iteration of the loop (this is probably your best quick fix)
IQueryable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
if (course != null) {
int CourseID = course.CourseID;
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(CourseID));
}
}
List<Student> studentList = filteredStudents.ToList<Student>();
Second
Resolve the IEnumerable expression inside the loop (probably much less efficient):
IEnumerable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
if (course != null) {
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(course.CourseID))
.ToList();
}
}
List<Student> studentList = filteredStudents.ToList<Student>();
Third
Use more appropriate linq operators/lambda expressions to eliminate the foreach loop:
var studentList = context.Students.Where(s => s.Courses.Select(c => c.CourseID).Intersect(filter.Courses.Select(c => c.CourseID)).Any()).ToList();
Or in a more readable way:
IQueryable<Student> filteredStudents = context.Students;
var courses = filter.Courses.Select(c => c.CourseID);
var studentList = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID)
.Intersect(courses)
.Any()
).ToList();
If you play with this a bit, the performance should meet or far exceed the foreach loop through clever internal use of HashSets or — if you're very lucky — by sending a JOIN query to the DB). Just be careful, because it's easy to write something here that make numerous "extra" calls out to the DB inside that Intersect()
or Any()
method. Even so, this is the option I tend to prefer, with the exception that I probably wouldn't bother to call .ToList()
at the end.
This also illustrates why I don't have much use for ORMs like Entity Framework, linq-to-sql, or even NHibernate or ActiveRecord. If I'm just writing SQL, I can know I'm getting the correct join query. I could do that with an ORM, too, but now I still need to know about the specific SQL I'm creating, and I also have to know how to get the ORM to do what I want.
Fourth
Use C# 5.0. This is fixed in the most recent version of C#, so that each iteration of a for/foreach loop is its own variable.