CVImageBuffer comes back with extra column padding. How do I crop it?

Question

I have a CVImageBuffer that comes back with recorded height of 640px and width of 852px. The bytes per row are 3456. You'll notice that 3456/852px != 4 (it's something like 4.05). After some inspection, 864 would be the width that makes bytesPerRow/width = 4.0. So, it seems like there are an extra 12px on each row (padded on the right). I'm assuming this is because these buffers are optimized for some multiple that this image does not have.

When I render this buffer in OpenGL it looks terrible (see below). What I noticed is that the pattern repeats every 71px, which makes sense because if there are an extra 12px per row then 852px / 12px = 71. So the extra 12 pixels seem to be causing the problem.

How do I get rid of these extra pixels very quickly and then use this data to read into OpenGL ES? Or rather, how do I read into OpenGL ES by skipping these extra pixels on each row?

See my answer to your other related question here: http://stackoverflow.com/questions/6540710/ios-cvimagebuffer-distorted-from-avcapturesessiondataoutput-with-avcapturesessio/7953253#7953253 — Dex, Oct 31 '11 at 11:45

score 7 · Answer 1 · answered Feb 29 '12 at 02:56

It's pretty common for images that are used with high-speed image processing algorithms to have padding at the end of each line, so that the line has a pixel or byte size that is a multiple of 4, 8, 16, 32, and so on. This makes it much easier to optimize certain algorithms for speed, especially in combination with SIMD instruction sets like SSE on x86 or NEON on ARM. In your case the padding is 12 pixels, which suggests that Apple optimizes their algorithms for processing 32 pixels at a time; 852 is not divisible by 32, 864 is, hence the lines are padded by 12 pixels to maintain a 32-pixel alignment. The correct technical terms are size and stride size or, in the case of images, width and stride width. The width is the amount of actual pixel data per line; the stride width is the real line size, including pixel data and optional padding at the end of the line.
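The arithmetic above can be checked with a small sketch. The helper names `align_up` and `stride_bytes` are made up for illustration, and the 32-pixel alignment is only the inference drawn above, not a documented Core Video guarantee:

```c
#include <assert.h>
#include <stddef.h>

/* Round `width` up to the next multiple of `alignment`.
   `alignment` must be a power of two (hypothetical helper). */
static size_t align_up(size_t width, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (width + alignment - 1) & ~(alignment - 1);
}

/* Bytes per row of a padded image: stride width times bytes per pixel. */
static size_t stride_bytes(size_t width, size_t alignment, size_t bytes_per_pixel)
{
    return align_up(width, alignment) * bytes_per_pixel;
}
```

With the numbers from the question, `align_up(852, 32)` gives a stride width of 864 pixels, and `stride_bytes(852, 32, 4)` gives the reported 3456 bytes per row.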

Standard OpenGL lets you load textures whose stride width is larger than the actual texture width. This is done by setting the glPixelStore parameter GL_UNPACK_ROW_LENGTH accordingly. Note that this "stride padding skip" is usually implemented in the CPU part of the driver, so it is not an operation performed on the GPU; in fact, the driver removes the extra padding before uploading the data to the GPU. As OpenGL ES is designed to run on embedded devices, which may have very limited CPU resources available, this option was removed from OpenGL ES to keep driver development simple, even for very weak embedded CPUs. This leaves you with four options to deal with your problem:

1. Preprocess the texture to remove the padding using a C copy loop that skips the extra pixels at the end of each line. This approach is rather slow but easy to implement.
2. Preprocess the texture as in option (1), but use the compiler's SIMD intrinsics to make use of NEON instructions. This will be about 2 times faster than option (1), but it's also harder to implement, and you'll need some knowledge of NEON instructions and how to use them to achieve this goal.
3. Preprocess the texture as in option (2), but use a pure assembly implementation. This will be about 3 times faster than option (2), so about 6 times faster than option (1), but it's also a lot harder to implement, since you'll need knowledge of ARM assembly programming as well as NEON instructions.
4. Load the texture with padding and adjust the texture coordinates so that OpenGL ignores the padding pixels. Depending on how complex your texture mapping is, this might be very easy to implement. It's faster than any of the options above, and the only downside is that you waste a little more texture memory on the GPU.

I know very little about ARM assembly programming and even less about NEON instructions, so I cannot really help you with options (2) and (3). I could show you an implementation for option (1), however, I'm afraid it might be too slow for your purpose. This only leaves the last option which I have been using myself plenty of times in the past.

We declare 3 variables: width, height, and stride width.

GLsizei width = 852;
GLsizei height = 640;
GLsizei strideWidth = 864;

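When you load the texture data (assuming rawData points to the raw image bytes), you pretend that strideWidth is the "real width":

// The buffer holds 4 bytes per pixel (3456 / 864), so a 4-component format is needed.
// Assumption: for BGRA capture data, pass GL_BGRA_EXT as the format where that extension is available.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA,
    strideWidth, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, rawData);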

Texture coordinates in OpenGL are normalized, which means the lower left corner is always (0.0f, 0.0f) and the upper right corner is always (1.0f, 1.0f), regardless of the pixel size the texture really has. These two values could be called (x, y), but to avoid confusing them with vertex coordinates, they are called (s, t) instead.

To make OpenGL cut off the padding pixels, you just need to adjust all s-coordinates by a certain factor, let's call it SPCF (Stride Padding Cut Factor), which you calculate the following way:

float SPCF = (float)width / strideWidth;

So instead of the texture coordinate (0.35f, 0.6f), you would use (0.35f * SPCF, 0.6f). Of course you shouldn't perform this calculation once per rendered frame. Instead you should copy the original texture coordinates, adjust all s-coordinates once by SPCF and then use these adjusted coordinates when rendering frames. If you ever reload the texture in the future and SPCF has changed, repeat the adjustment process. In case width equals strideWidth, this algorithm works as well, as in that case SPCF is 1.0f and thus won't alter the s-coordinates at all, which would be correct, since there is no padding to cut off.
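The one-time adjustment could be sketched like this. The function name `cut_stride_padding` and the interleaved (s, t) array layout are assumptions made for illustration; adapt them to however your texture coordinates are stored:

```c
#include <assert.h>
#include <stddef.h>

/* Copy `vertex_count` interleaved (s, t) pairs from `src` to `dst`,
   scaling every s-coordinate by width / stride_width (the SPCF).
   The t-coordinates are copied unchanged. Hypothetical helper. */
static void cut_stride_padding(const float *src, float *dst,
                               size_t vertex_count,
                               int width, int stride_width)
{
    assert(width > 0 && stride_width >= width);
    const float spcf = (float)width / (float)stride_width;
    for (size_t i = 0; i < vertex_count; ++i) {
        dst[2 * i]     = src[2 * i] * spcf;  /* s: skip the padding columns */
        dst[2 * i + 1] = src[2 * i + 1];     /* t: rows carry no padding */
    }
}
```

Call it once after loading the texture, keep the original coordinates around, and call it again only when width or strideWidth changes.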

The downside of this trick is that the texture will need about 1.4% more memory in your case than would otherwise be necessary (864 / 852 ≈ 1.014), which also means that the texture upload via glTexImage2D will be about 1.4% slower. I guess that is acceptable and still much faster than any of the other CPU-intensive options above.
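For comparison, the plain copy loop from option (1) might look roughly like the sketch below. The function name and parameters are hypothetical; it assumes tightly packed output rows and a source whose rows are `stride_bytes` apart:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `height` rows from a padded source buffer into a tightly packed
   destination. `row_bytes` is width * bytes per pixel, `stride_bytes`
   is the padded row size of the source. Hypothetical helper. */
static void copy_without_padding(const unsigned char *src, unsigned char *dst,
                                 size_t row_bytes, size_t stride_bytes,
                                 size_t height)
{
    assert(row_bytes <= stride_bytes);
    for (size_t y = 0; y < height; ++y) {
        /* Copy only the real pixel data; the padding at the end of
           each source row is skipped by advancing a full stride. */
        memcpy(dst + y * row_bytes, src + y * stride_bytes, row_bytes);
    }
}
```

For the buffer in the question, row_bytes would be 852 * 4 = 3408 and stride_bytes would be 3456. The packed result can then be uploaded with the real width instead of strideWidth, at the cost of one extra pass over the image per frame.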

score 0 · Answer 2 · answered Jun 30 '11 at 23:22

You can use the GL_UNPACK_ROW_LENGTH state variable in conjunction with GL_UNPACK_ALIGNMENT to control the row length of your data (the GL_PACK_* variants apply to reading pixels back, not to texture uploads). You can look it up in the man page of e.g. glPixelStorei or, even better, with some images here: http://www.opengl.org/resources/features/KilgardTechniques/oglpitfall/

I guess it will be something like:

glPixelStorei(GL_UNPACK_ROW_LENGTH, nPixelsPerRow); // row length in pixels, including padding
glPixelStorei(GL_UNPACK_ALIGNMENT, 4); // byte alignment of each row: must be 1, 2, 4 or 8

Please note that I didn't test the above code. It's merely intended as a hint.

cheers

pokey909

I wish I could use those constants, but OpenGL ES does not support GL_PACK_ROW_LENGTH or GL_PACK_ALIGNMENT – sotangochips Jun 30 '11 at 23:31
Damn, that's true. Can you use **glTexSubImage2D**? Oh, and apparently you can at least use GL_PACK_ALIGNMENT, but only up to a value of 8 bytes, so it's no help either :-( – pokey909 Jun 30 '11 at 23:42
It seems OpenGL ES 2.0 does not support GL_PACK_ROW_LENGTH, but version 3.0 does. There is an extension for working with GL_PACK_ROW_LENGTH in OpenGL ES 2.0. For more detail: http://stackoverflow.com/questions/18149967/updating-only-a-horizontal-subregion-of-a-texture-in-opengl-es-2-0 – Nhat Dinh Nov 01 '16 at 01:45
