1

I am having performance issues with this LINQ Query. The data is laoded into to this.students already. Now when I call the GetStudentData function say 1000 times it has a huge overhead. Is there a way of improving this without changing the LINQ to a loop

   public Student GetStudentData()
   {         
          IEnumerable<Students> studentTypes = this.students.Where(x => (x.studentsId  == studentId && x.StduentType.Equals(studentType)));
          if(studentTypes.Count==0) return new Student() { studentid=studentID};
          return (Student)studentTypes.First();
   }

So here are the results when looping through it 10000 times with my original version

Original Version : 5.6 seconds on the average New Version @des's Code with FirstOrDefault : 3.6 seconds

Zbigniew
  • 27,184
  • 6
  • 59
  • 66
abbas
  • 424
  • 3
  • 10
  • 22

3 Answers3

4

When you use Where you loop through all records which fulfill given conditions, when you use First you just search for first record which fullfills condition, so using First should speed it up.

public Student GetStudentData()
{         
    // get first student by id and type, return null if there is no such student
    var student = students.FirstOrDefault(i => i.studentsId == studentId && i.StudentType.Equals(studentType));

    // if student is null then return new student
    return student ?? new Student();
}
Zbigniew
  • 27,184
  • 6
  • 59
  • 66
  • that might simplify the code, but there's no optimization there – Didaxis Jun 27 '12 at 16:35
  • @ErOx check my updated answer, `First` should be faster than `Where` because it won't check all records... – Zbigniew Jun 27 '12 at 16:37
  • Good explanation...let me test and see the results – abbas Jun 27 '12 at 16:39
  • that's not right. why would you think there's a performance difference between .First(anonymous-delegate) and Where(anonymous-delegate).First()? Don't you think the *exact same* filter is being ran in both cases? – Didaxis Jun 27 '12 at 16:44
  • look at the IL, I can assure you it is – Didaxis Jun 27 '12 at 16:45
  • What about this: `var all = students.Where(i => i.studentsId == studentId && i.StudentType.Equals(studentType)); var first = all.First();`? Won't if firstly search all students with those conditions and then take first one? – Zbigniew Jun 27 '12 at 16:47
  • sure, but my point is the same filter is being ran one way or the other, so there's really no optimization there. `First(x => x ...)` is simply a shorthand for `.Where(x => x ... ).First()` – Didaxis Jun 27 '12 at 16:50
  • @ErOx OP's code also contains `Count()` - which completely traverses the enumerator returned from `Where()`. And then `First()` traverses it **again** until the first element. Des's code does only one (and shorter!) traversal instead of two. – Branko Dimitrijevic Jun 27 '12 at 16:53
  • Example: you have a collection { "a", "b", "c" }, when you use `Where(i => i == "a")` you loop through whole collection (so you check "a", "b", "c"), whereas when you use `First(i => i == "a")` you'll only check first element of collection (because condition has been statisfied). Am I not right? – Zbigniew Jun 27 '12 at 16:53
  • It did increase the performance.Let me add the results and modify my code as well....but can it be increased further – abbas Jun 27 '12 at 16:56
  • http://stackoverflow.com/questions/8663897/why-is-linq-wherepredicate-first-faster-than-firstpredicate# <--would suggest you did not see a performance benefit – Didaxis Jun 27 '12 at 16:56
  • @des, no that's not right. see the link to the question above – Didaxis Jun 27 '12 at 16:59
  • @ErOx this is interesting, I didn't know about that. I'll educate myself on this matter later. Perhaps I was wrong, but on the other hand abbas reported performance improvement. I'll test this whenever I'll have some spare time. Thank you for conversation ErOx. – Zbigniew Jun 27 '12 at 17:00
  • no prob, you intuition is absolutely right to think it would be faster. indeed it is interesting that it's not faster – Didaxis Jun 27 '12 at 17:04
  • Erox but my results have improved.Is there any reson for that – abbas Jun 27 '12 at 17:05
  • I suspect you're benchmarking the entire method, in which case yes, you will see a performance benefit (because of the omission of the call to `.Count()`). if you benchmark *just* the change to the linq query, you won't see a performance benefit. – Didaxis Jun 27 '12 at 17:13
  • Thanks for pointing that out...ur right i am benchmarking the entire method...but can you suggest any other way to optimize it – abbas Jun 27 '12 at 17:14
  • @abbas You are trying to fix the symptoms instead of fixing the root problem: using a data structure that is not well suited for the job. If you really care for performance, you'll have to consider a more substantial refactoring of your code. – Branko Dimitrijevic Jun 27 '12 at 17:21
  • The datasructure is a HashSet of students...as i mentioned before this is just a simpler version of the function. – abbas Jun 27 '12 at 17:24
2

Well, the issue is precisely the fact that you are calling this method in a loop, supposedly, 1000 times!

Why not changing the method to receive a list of studentIDs and return the 1000 students in one shot? Something like

var studentTypes = from c in this.students 
                   where studentIDs.Contains(c.StudentID)
                   select c;

Where studentIDs can be an int[] containing the list of student ids you want.

Icarus
  • 63,293
  • 14
  • 100
  • 115
  • I think this is on the right track -looks like there needs to be some refactoring in the OP's logic – Didaxis Jun 27 '12 at 16:37
  • Nopes...This is called with different parameters...this is just a simpler version of the real function...so that its easy to understand. – abbas Jun 27 '12 at 16:38
0

Refactor your code so this.students is a Dictionary<int, Student> (key is StudentId), then reimplement your method similarly to this:

public Student GetStudentData(int studentId, StudentType studentType) {
    Student result;
    if (this.students.TryGetValue(studentId, out result))
        if (result.StudentType.Equals(studentType)))
            return result;
    return new Student { StudentId = studentId };
}

If you absolutely can't refactor this.students, you can always maintain the dictionary in parallel.

Or, you can simply create a temporary dictionary (this.students.ToDictionary(student => student.StudentId)) just before the 1000-iteration loop.

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167