I have a problem understanding the training phase of the Viola-Jones algorithm.
Here is the algorithm in pseudo code, as far as I understand it:
    # learning phase of Viola-Jones
    foreach feature # these are the patterns, see figure 1, page 139
        # these features are moved over the entire 24x24 sample pictures
        foreach (x,y) so that the feature still fits inside the 24x24 sample picture
            # the features are scaled over the window from [(x,y) - (24,24)]
            foreach scaling of the feature
                # calc the best threshold for a single, scaled feature
                # for this, the feature is put over each sample image (all 24x24 in the paper)
                foreach positive_image
                    thresh_pos[this positive image] := HaarFeatureCalc(position of the window, scaling, feature)
                foreach negative_image
                    thresh_neg[this negative image] := HaarFeatureCalc(position of the window, scaling, feature)
                #### what's next?
                #### how do I use the thresholds (pos / neg)?
By the way, this is the same loop structure as in this SO question: Viola-Jones' face detection claims 180k features
The algorithm calls the HaarFeatureCalc function, which I think I understand:
    function: HaarFeatureCalc
        threshold := (sum of the pixels in the sample picture that are white in the feature pattern) -
                     (sum of the pixels in the sample picture that are grey in the feature pattern)
        # this is calculated with the integral image, described in section 2.1 of the paper
        return the threshold
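To make this concrete, here is a minimal Python sketch of how I read HaarFeatureCalc for one horizontal two-rectangle pattern. It assumes NumPy and a grayscale image stored as a 2D array. The names `integral_image`, `rect_sum` and `two_rect_feature` are my own and do not come from the paper, and the layout (left half white, right half grey) is just one assumed pattern. It keeps the white-minus-grey sign from the pseudocode above.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y),
    width w and height h, using four lookups in the integral image."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, x, y, w, h):
    """Assumed pattern: left half is 'white', right half is 'grey'.
    Returns white sum minus grey sum, as in the pseudocode above."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    grey = rect_sum(ii, x + half, y, half, h)
    return white - grey
```

The point of the integral image is that each rectangle sum costs four array lookups no matter how large the rectangle is, so scaling the feature does not make it more expensive to evaluate.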
Are there any mistakes so far?
The learning phase of Viola-Jones basically determines which of the features/detectors are the most decisive. I don't understand how the AdaBoost procedure described in the paper works.
Question: what would the AdaBoost from the paper look like in pseudo code?
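To make the question concrete, here is a rough Python sketch of what I would expect the boosting loop to look like. It is not taken from the paper verbatim and rests on several assumptions: all feature values are precomputed into a matrix `F` (one row per feature, one column per sample image), labels are 1 for faces and 0 for non-faces, each weak classifier is a single feature with a threshold and a polarity, and the threshold search is a naive scan over all observed values. The names `best_stump`, `adaboost` and `classify` are hypothetical.

```python
import numpy as np

def best_stump(values, labels, weights):
    """Best (weighted error, threshold, polarity) for one feature.
    The stump predicts 1 when polarity * value < polarity * threshold."""
    best = (np.inf, 0.0, 1)
    # Naive candidate set: every observed value, plus one above the maximum.
    candidates = np.append(np.unique(values), values.max() + 1.0)
    for theta in candidates:
        for polarity in (1, -1):
            pred = (polarity * values < polarity * theta).astype(int)
            err = weights[pred != labels].sum()
            if err < best[0]:
                best = (err, theta, polarity)
    return best

def adaboost(F, labels, rounds):
    """F[j, i] is the value of feature j on sample i; labels are 0/1."""
    n_pos, n_neg = labels.sum(), (1 - labels).sum()
    # Each class starts with half of the total weight.
    weights = np.where(labels == 1, 1.0 / (2 * n_pos), 1.0 / (2 * n_neg))
    strong = []
    for _ in range(rounds):
        weights = weights / weights.sum()          # normalize to a distribution
        stumps = [best_stump(F[j], labels, weights) for j in range(F.shape[0])]
        j = int(np.argmin([s[0] for s in stumps])) # feature with lowest error
        err, theta, polarity = stumps[j]
        err = max(err, 1e-10)                      # guard against a zero error
        beta = err / (1.0 - err)
        pred = (polarity * F[j] < polarity * theta).astype(int)
        # Correctly classified samples are down-weighted by beta.
        weights = weights * np.where(pred == labels, beta, 1.0)
        strong.append((j, theta, polarity, np.log(1.0 / beta)))
    return strong

def classify(strong, feature_values):
    """feature_values[j] is the value of feature j on one window."""
    total = sum(alpha for _, _, _, alpha in strong)
    score = sum(alpha for j, theta, p, alpha in strong
                if p * feature_values[j] < p * theta)
    return int(score >= 0.5 * total)
```

If this reading is right, the values I stored in thresh_pos and thresh_neg are really feature values, and the threshold is what `best_stump` picks from them in each round using the current sample weights. I am not sure this matches the paper, which is why I am asking.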