I'm trying to understand the algorithm behind the super fast blur algorithm. Below is the port to java that works with android as a test. Looks like this version makes some optimisations that I don't quite understand and there aren't any comments either.
void fastblur(Bitmap img, int radius){
if (radius<1){
return;
}
int w= img.getWidth();
int h=img.getHeight();
int wm=w-1;
int hm=h-1;
int wh=w*h;
int div=radius+radius+1;
int r[]=new int[wh];
int g[]=new int[wh];
int b[]=new int[wh];
int rsum,gsum,bsum,x,y,i,p,p1,p2,yp,yi,yw;
int vmin[] = new int[Math.max(w,h)];
int vmax[] = new int[Math.max(w,h)];
int[] pix= new int[w*h];
img.getPixels(pix, 0, w, 0,0,w, h);
int dv[]=new int[256*div];
for (i=0;i<256*div;i++){
dv[i]=(i/div);
}
yw=yi=0;
for (y=0;y<h;y++){
rsum=gsum=bsum=0;
for(i=-radius;i<=radius;i++){
p=pix[yi+Math.min(wm,Math.max(i,0))];
rsum+=(p & 0xff0000)>>16;
gsum+=(p & 0x00ff00)>>8;
bsum+= p & 0x0000ff;
}
for (x=0;x<w;x++){
r[yi]=dv[rsum];
g[yi]=dv[gsum];
b[yi]=dv[bsum];
if(y==0){
vmin[x]=Math.min(x+radius+1,wm);
vmax[x]=Math.max(x-radius,0);
}
p1=pix[yw+vmin[x]];
p2=pix[yw+vmax[x]];
rsum+=((p1 & 0xff0000)-(p2 & 0xff0000))>>16;
gsum+=((p1 & 0x00ff00)-(p2 & 0x00ff00))>>8;
bsum+= (p1 & 0x0000ff)-(p2 & 0x0000ff);
yi++;
}
yw+=w;
}
for (x=0;x<w;x++){
rsum=gsum=bsum=0;
yp=-radius*w;
for(i=-radius;i<=radius;i++){
yi=Math.max(0,yp)+x;
rsum+=r[yi];
gsum+=g[yi];
bsum+=b[yi];
yp+=w;
}
yi=x;
for (y=0;y<h;y++){
pix[yi]=0xff000000 | (dv[rsum]<<16) | (dv[gsum]<<8) | dv[bsum];
if(x==0){
vmin[y]=Math.min(y+radius+1,hm)*w;
vmax[y]=Math.max(y-radius,0)*w;
}
p1=x+vmin[y];
p2=x+vmax[y];
rsum+=r[p1]-r[p2];
gsum+=g[p1]-g[p2];
bsum+=b[p1]-b[p2];
yi+=w;
}
}
img.setPixels(pix,0, w,0,0,w,h);
}
Correct me if I'm wrong by my speculations:
What does the below loop do? Is it associated with pre-computing the kernel table? What about div, is that the kernel table size? I guess what I'm trying to ask is, what is dv[] supposed to store?
int dv[]=new int[256*div];
for (i=0;i<256*div;i++){
dv[i]=(i/div);
}
Looking at the horizontal pass:
The below loop looks like it's summing up the separate RGB values, but it only does this at the starting pixel for each row, since yi
is only incremented once we finish processing all pixels up until the width is reached. Is this because we end up adding to the RGB sums as we process the pixels in the next loop?
for(i=-radius;i<=radius;i++){
int ind = yi+Math.min(wm,Math.max(i,0));
p=pix[ind];
rsum+=(p & 0xff0000)>>16;
gsum+=(p & 0x00ff00)>>8;
bsum+= p & 0x0000ff;
}
Are we only selecting the left most pixel and right most pixel according to the radius and the current pixel position?
if(y==0){
vmin[x]=Math.min(x+radius+1,wm);
vmax[x]=Math.max(x-radius,0);
}
p1=pix[yw+vmin[x]];
p2=pix[yw+vmax[x]];
Next is what is confusing me the most: Am I correct to say that were getting the difference between right and left pixels and adding that the running RGB totals that we have?
rsum+=((p1 & 0xff0000)-(p2 & 0xff0000))>>16;
gsum+=((p1 & 0x00ff00)-(p2 & 0x00ff00))>>8;
bsum+= (p1 & 0x0000ff)-(p2 & 0x0000ff);
I haven't had a look at the second pass since this is pretty much going over my head. Any clarification would be appreciated and any commentary on the loop on the vertical pass would be helpful as well thanks.