In their paper describing the Viola-Jones object detection framework ("Robust Real-Time Face Detection" by Viola and Jones), the authors state:

All example sub-windows used for training were variance normalized to minimize the effect of different lighting conditions.

My question is: how can I implement this image normalization in Octave?

I'm NOT looking for the specific implementation that Viola & Jones used, but a similar one that produces almost the same output. I've been following a lot of haar-training tutorials (trying to detect a hand) but have not yet been able to produce a good detector (XML).

I've tried contacting the authors, but have not received a response yet.

amit
Koji Ikehara

1 Answer

I already gave general guidelines on how to do it in this thread.

Here is how to do method 1 (normalizing to zero mean and unit standard deviation) in Octave. It is demonstrated on a random matrix A, but it applies to any matrix, which is how an image is represented:

>>A = rand(5,5)
A =

   0.078558   0.856690   0.077673   0.038482   0.125593
   0.272183   0.091885   0.495691   0.313981   0.198931
   0.287203   0.779104   0.301254   0.118286   0.252514
   0.508187   0.893055   0.797877   0.668184   0.402121
   0.319055   0.245784   0.324384   0.519099   0.352954

>>s = std(A(:))
s =  0.25628
>>u = mean(A(:))
u =  0.37275
>>A_norm = (A - u) / s
A_norm =

  -1.147939   1.888350  -1.151395  -1.304320  -0.964411
  -0.392411  -1.095939   0.479722  -0.229316  -0.678241
  -0.333804   1.585607  -0.278976  -0.992922  -0.469159
   0.528481   2.030247   1.658861   1.152795   0.114610
  -0.209517  -0.495419  -0.188723   0.571062  -0.077241

In the above you use:

  • To get the standard deviation of the matrix: s = std(A(:))
  • To get the mean value of the matrix: u = mean(A(:))
  • And then following the formula A'[i][j] = (A[i][j] - u)/s with the vectorized version: A_norm = (A - u) / s
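The steps above can be wrapped into a small reusable function. This is only a minimal sketch, not what Viola & Jones actually implemented: the name variance_normalize, the eps_std threshold and the handling of flat windows are my own assumptions. It converts to double first, since image data is often uint8, and guards against dividing by a zero standard deviation:

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper: scale a window to zero mean and unit standard deviation.
function W_norm = variance_normalize(W, eps_std)
  if nargin < 2
    eps_std = 1e-8;            % assumed threshold for "flat" windows
  end
  W = double(W);               % do the arithmetic in double, not uint8
  u = mean(W(:));              % mean over all pixels
  s = std(W(:));               % standard deviation over all pixels
  if s < eps_std
    W_norm = zeros(size(W));   % flat window: only the mean can be removed
  else
    W_norm = (W - u) / s;
  end
end

W = uint8(magic(4) * 10);      % stand-in for a small grayscale sub-window
W_norm = variance_normalize(W);
printf("mean = %g, std = %g\n", mean(W_norm(:)), std(W_norm(:)));
```

For a real training window you would pass the matrix returned by imread instead of the magic(4) stand-in.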

Normalizing it with vector normalization is also simple:

>>len = sqrt((A(:))' * (A(:)))
len =  2.2472
>>A_norm = A / len
A_norm =

   0.034959   0.381229   0.034565   0.017124   0.055889
   0.121122   0.040889   0.220583   0.139722   0.088525
   0.127806   0.346703   0.134059   0.052637   0.112369
   0.226144   0.397411   0.355057   0.297343   0.178945
   0.141980   0.109375   0.144351   0.231000   0.157065

In the above:

  • len is the length (Euclidean norm) of the matrix viewed as a vector, computed with a vectorized multiplication (A(:)' * A(:) is actually sum(A[i][j]^2)). The variable is named len rather than abs so that it does not shadow Octave's built-in abs function.
  • Then we use it to normalize the vector so it will be of length 1.
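As a quick sanity check of the vector normalization, here is a small sketch on a matrix whose length is known in advance (the matrix and the variable names len and A_unit are just for illustration). It also shows that the built-in norm gives the same length when applied to A(:), whereas norm(A) on the matrix itself returns the matrix 2-norm, which is a different quantity:

```octave
A = [3 0; 0 4];               % example matrix; its length is sqrt(9 + 16) = 5
len = sqrt(A(:)' * A(:));     % length via the vectorized sum of squares
len_builtin = norm(A(:));     % same value using the built-in vector 2-norm
A_unit = A / len;             % normalized so that its length is 1

printf("len = %g, norm(A(:)) = %g, norm(A) = %g\n", len, len_builtin, norm(A));
printf("length after normalization = %g\n", norm(A_unit(:)));
```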
amit
  • Hi! I was able to do it in Octave. The output is a matrix with values from 0 to 2 (depending on the size of the picture, in my case 20 by 20). But when I tried to save it using imwrite(norm, "output.pgm"), the picture was all black. Any thoughts? Thanks in advance! – Koji Ikehara Dec 26 '12 at 15:02
  • @KojiIkehara: It is expected to be all black: the range [0,2] is fairly small (low numbers), so you get very little visible variety. This processing does not make the image "better" for humans, it makes it better for an algorithm to analyze. – amit Dec 26 '12 at 19:47
  • Hi! Thanks! I have another question: do you know if they used these "all black pictures" as the raw data for the training? Or will the algorithm take care of it? – Koji Ikehara Jan 07 '13 at 16:04
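Regarding the all-black output discussed in the comments: if you only want to look at a normalized matrix, one option is to stretch a copy of it to the full [0,1] range before saving. This is a minimal sketch for viewing purposes only; the function name rescale_for_display and the output file name are hypothetical, and nothing here says what data the original authors fed into training:

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper: linearly map the values of M onto [0, 1] for viewing.
function I = rescale_for_display(M)
  M = double(M);
  lo = min(M(:));
  hi = max(M(:));
  if hi == lo
    I = zeros(size(M));          % constant matrix: nothing to stretch
  else
    I = (M - lo) / (hi - lo);    % smallest value -> 0, largest value -> 1
  end
end

M = [0.03 0.05; 0.10 0.40];      % stand-in for a normalized matrix with small values
I = rescale_for_display(M);
disp([min(I(:)), max(I(:))]);
% imwrite(I, "output_view.pgm");  % hypothetical file name, for viewing only
```

Keep the normalized matrix itself for any further processing; the rescaled copy changes the values and is meant only for inspection.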