I have read the Viola-Jones paper from 2004. Section 3.1 explains the threshold calculation, but I am quite confused. It reads:
For each feature, the examples are sorted based on feature value
Question 1) Is the sorted list a list of Haar feature values computed from the integral images of the examples? So if we have one feature and 10 images (positive and negative), do we get 10 values, one for each input image?
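To make Question 1 concrete, here is a minimal sketch of how I currently picture it, which may well be wrong. The function names (integral_image, rect_sum, two_rect_feature), the left-minus-right feature layout and the toy 2x4 images are my own assumptions for illustration, not taken from the paper.

```python
def integral_image(img):
    """Return ii with one extra zero row/column, where
    ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Hypothetical two-rectangle feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy examples (made up): one feature evaluated on three tiny images.
images = [
    [[1, 1, 0, 0], [1, 1, 0, 0]],
    [[0, 0, 1, 1], [0, 0, 1, 1]],
    [[1, 1, 1, 1], [1, 1, 1, 1]],
]

# One feature value per image.
values = [two_rect_feature(integral_image(img), 0, 0, 4, 2) for img in images]

# Indices of the examples, sorted by feature value.
order = sorted(range(len(values)), key=lambda i: values[i])
```

So with one feature and three images I get three values, and the "sorted list" would be those values in the order given by order.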
The AdaBoost optimal threshold for that feature can then be computed in a single pass over this sorted list. For each element in the sorted list, four sums are maintained and evaluated: the total sum of positive example weights T+, the total sum of negative example weights T−, the sum of positive weights below the current example S+, and the sum of negative weights below the current example S−.
Question 2) What is the purpose of sorting? My guess is that the example with the highest feature value is the one the feature describes best. But algorithmically, how does sorting affect (S+, S-, T+, T-)?
Question 3) For the sorted list we now calculate (S+, S-, T+, T-). Does each entry hold its own (S+, S-, T+, T-), or is there only one (S+, S-, T+, T-) for the whole list?
The error for a threshold which splits the range between the current and previous example in the sorted list is: e = min(S+ + (T− − S−), S− + (T+ − S+)).
Question 4) This somewhat answers my previous question, but I am not sure. In order to have an "e" for each input image, we would need to maintain (S+, S-, T+, T-) for each entry in the list. But what do we do with the N values of "e" (one per image) after we have calculated them for that feature?
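Here is a minimal sketch of how I currently read the single-pass procedure, so that any misunderstanding is easy to point out. It assumes labels are 1 for positive and 0 for negative, that T+ and T- are computed once while S+ and S- are running sums updated as the pass moves through the sorted list, and that the smallest e is what gets kept. The function name best_threshold, the midpoint threshold and the polarity strings are my own choices, not from the paper.

```python
def best_threshold(values, labels, weights):
    """One pass over the examples sorted by feature value.

    values  : feature value of each example
    labels  : 1 for a positive example, 0 for a negative one
    weights : current AdaBoost weight of each example
    Returns (lowest_error, threshold, polarity).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Totals over all examples, computed once.
    t_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    t_neg = sum(w for w, y in zip(weights, labels) if y == 0)

    # Running sums of the weights below the current example.
    s_pos = 0.0
    s_neg = 0.0

    best = (float("inf"), None, None)
    prev = None
    for i in order:
        v = values[i]
        # Only consider a split between two distinct feature values.
        if prev is None or v != prev:
            # Everything below labeled negative: S+ + (T- - S-)
            err_below_neg = s_pos + (t_neg - s_neg)
            # Everything below labeled positive: S- + (T+ - S+)
            err_below_pos = s_neg + (t_pos - s_pos)
            e = min(err_below_neg, err_below_pos)
            if e < best[0]:
                # Assumption: place the threshold halfway between neighbours.
                thr = v if prev is None else (prev + v) / 2
                polarity = ("below_negative"
                            if err_below_neg <= err_below_pos
                            else "below_positive")
                best = (e, thr, polarity)
        # Move the current example into the "below" sums.
        if labels[i] == 1:
            s_pos += weights[i]
        else:
            s_neg += weights[i]
        prev = v
    return best


# Made-up example: four examples with equal weights.
err, thr, polarity = best_threshold(
    values=[4, -4, 0, 3],
    labels=[1, 0, 0, 1],
    weights=[0.25, 0.25, 0.25, 0.25],
)
```

If this reading is right, sorting is what lets S+ and S- be updated incrementally instead of being recomputed for every candidate threshold, and only the minimum e (with its threshold) is kept per feature. Is that correct?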
Thanks in advance. Please let me know if this is confusing or if my questions need more clarification.