
Based on the concepts in my question here and the problems with PNG files outlined in my other question here, I'm going to try loading an RGBA image from two JPEGs. One will contain the RGB and the other only the alpha. I can either save the second as a greyscale JPEG or as an RGB and pull the alpha data from the red component.

In a second step I'll save the raw image data out to a file in the cache. I'll then run a test to determine whether it's faster to load the raw data or to decompress the JPEGs and rebuild the raw data. If the raw load wins, on subsequent loads I'll check for the existence of the raw file in the cache; if it doesn't, I'll skip that file save.
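Something like this is what I'm picturing for the cache step (the .raw file naming is just for illustration; name, widthRGB, heightRGB and imageDataRGB are the variables from the code further down):

NSString *cacheDir = [NSSearchPathForDirectoriesInDomains(NSCachesDirectory,
                                                          NSUserDomainMask, YES) firstObject];
NSString *rawPath = [cacheDir stringByAppendingPathComponent:
                     [NSString stringWithFormat:@"%@_RGBA.raw", name]];

if ([[NSFileManager defaultManager] fileExistsAtPath:rawPath]) {
    // Cache hit: skip both jpeg decodes and hand this straight to glTexImage2D.
    NSData *raw = [NSData dataWithContentsOfFile:rawPath];
    // ... glTexImage2D(..., raw.bytes);
} else {
    // Cache miss: build imageDataRGB from the two jpegs as below, then save it.
    [[NSData dataWithBytes:imageDataRGB length:4 * widthRGB * heightRGB]
        writeToFile:rawPath atomically:YES];
}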

I know how to load the two JPEGs into two UIImages. What I'm not sure about is the fastest or most efficient way of interleaving the RGB from one UIImage with whichever channel of the other UIImage I use for the alpha.

I see two possibilities. One would be at comment B below: iterate through all the pixels and copy the red from the "alpha JPEG" into the alpha of the imageData stream.

The other is that maybe there's some magic UIImage or Core Graphics call to copy a channel from one image into a channel of another. If there is, it would go somewhere around comment A.

Any ideas?

EDIT - Also: the process can't destroy any RGB information. The whole reason I need this is that PNGs from Photoshop premultiply the RGB with the alpha and thus destroy the RGB information (once a pixel's alpha is 0, its RGB is stored as 0 and is unrecoverable). I'm using the alpha for something other than an alpha in a custom OpenGL shader, so I'm looking for raw RGBA data where I can set the alpha to anything, to be used as a specular map, an illumination map, a height map, or something else other than alpha.

Here's my starter code minus my error checking and other proprietary crap. I have an array of textures that I use to manage everything about textures:

if (textureInfo[texIndex].generated==NO) {
    glGenTextures(1, &textureInfo[texIndex].texture);
    textureInfo[texIndex].generated=YES;
}

glBindTexture(GL_TEXTURE_2D, textureInfo[texIndex].texture);

// glTexParameteri commands are here based on options for this texture

NSString *path = [[NSBundle mainBundle] pathForResource:[NSString stringWithFormat:@"%@_RGB",name] ofType:type];

NSData *texData = [[NSData alloc] initWithContentsOfFile:path];
UIImage *imageRGB = [[UIImage alloc] initWithData:texData];

path = [[NSBundle mainBundle] pathForResource:[NSString stringWithFormat:@"%@_A",name] ofType:type];

texData = [[NSData alloc] initWithContentsOfFile:path];
UIImage *imageAlpha = [[UIImage alloc] initWithData:texData];

CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

// widthRGB/heightRGB and widthA/heightA come from the decoded images:
GLuint widthRGB = CGImageGetWidth(imageRGB.CGImage);
GLuint heightRGB = CGImageGetHeight(imageRGB.CGImage);
GLuint widthA = CGImageGetWidth(imageAlpha.CGImage);
GLuint heightA = CGImageGetHeight(imageAlpha.CGImage);

void *imageDataRGB = malloc( heightRGB * widthRGB * 4 );
void *imageDataAlpha = malloc( heightA * widthA * 4 );
CGContextRef thisContextRGB = CGBitmapContextCreate( imageDataRGB, widthRGB, heightRGB, 8, 4 * widthRGB, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big );
CGContextRef thisContextA = CGBitmapContextCreate( imageDataAlpha, widthA, heightA, 8, 4 * widthA, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big );

// **** A. In here I want to mix the two: take the R of imageAlpha and stick it in the alpha of imageRGB.

CGColorSpaceRelease( colorSpace );
CGContextClearRect( thisContextRGB, CGRectMake( 0, 0, widthRGB, heightRGB ) );
CGContextDrawImage( thisContextRGB, CGRectMake( 0, 0, widthRGB, heightRGB ), imageRGB.CGImage );

// **** B. OR maybe repeat the above 3 lines for imageAlpha.CGImage, and then
// **** do it here by iterating through the data and copying the R byte of imageDataAlpha onto the A byte of imageDataRGB

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, widthRGB, heightRGB, 0, GL_RGBA, GL_UNSIGNED_BYTE, imageDataRGB);

// **** In here I could save off the merged imageData to a binary file and load that later if it's faster

glBindTexture(GL_TEXTURE_2D, textureInfo[texIndex].texture);

// Generates a full MipMap chain to the current bound texture.
if (useMipmap) {
    glGenerateMipmap(GL_TEXTURE_2D);
}

CGContextRelease(thisContextRGB);
CGContextRelease(thisContextA);

free(imageDataRGB);
free(imageDataAlpha);
– badweasel
    It may be faster to make 2 textures out of the RGB and grayscale images, and do the merge with a shader. You might also check out the Accelerate.framework. I've found it has some useful swizzle routines when working on the CPU, and it might have something to pull one channel from one image and the other channels from another image. (I'm not positive - just speculating.) – user1118321 Mar 08 '14 at 02:00
  • No way it's faster. If you look at http://stackoverflow.com/questions/22110519/most-efficient-way-of-multi-texturing-ios-opengl-es2-optimization right now I'm doing the 4-way image. But that requires twice as many texture2D calls as I really need, since each call can get me 4 floats. I only need 8 floats to render a fragment with the 4 texture maps. Especially on older devices like the iPhone4, halving the number of texture2D calls should speed render times up a lot. – badweasel Mar 08 '14 at 02:16
  • Fair enough. Looking at the Accelerate.framework docs, it looks like [`vImageSelectChannels_ARGB8888`](https://developer.apple.com/library/ios/documentation/Performance/Reference/vImage_conversion/Reference/reference.html#//apple_ref/doc/uid/TP40005488-CH210-SW1) does what you want. You can even operate in-place, saving some memory (see the sketch after these comments). Or, you could use TIFF files to store straight alpha and store RGBA rather than png files or multiple jpgs, if that would be easier. – user1118321 Mar 08 '14 at 03:27
  • It would be easier but when I go to save a TIFF in Photoshop alpha is greyed out. I'll mess around and see if there is a way because certainly it would be easier to not mess with all of this. – badweasel Mar 08 '14 at 05:20
  • No.. the TIFF is premultiplying the alpha, so the RGB is destroyed. It must have to do with something Xcode or the iPhone is doing to it, because I can load the TIFF back into Photoshop and all the channels are there. Or it's something in my code above. – badweasel Mar 08 '14 at 05:42
  • There's also this: http://stackoverflow.com/questions/4012035/opengl-es-iphone-alpha-blending-looks-weird?rq=1 and this: http://www.cocos2d-iphone.org/forums/topic/png-loading-premultiplied-alpha-and-the-simulator-again/#post-269678 – badweasel Mar 08 '14 at 05:56
  • And this http://stackoverflow.com/questions/13336241/loading-4-channel-texture-data-in-ios further makes me think that it's difficult to get a single image in with RGBA without it being premultiplied. I'm back to my original question here of how to merge two images to do it. – badweasel Mar 08 '14 at 06:06
  • @user1118321 I recant my previous statement of "no way it's faster". I actually don't care that much about how long it takes to load the images within reason. Either way it's fast enough. It's the drawing fps that I wanted to improve. But you might be right that it might be faster or might not make a difference. Now that I've tried it. I can't figure out optimization for the older iPhone 4 and how anyone gets reasonable frame rates on it. Thanks for your input! – badweasel Mar 08 '14 at 21:16
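For reference, here's a rough sketch of the vImageSelectChannels_ARGB8888 route suggested above, assuming the two jpegs have already been drawn into the matching 4-bytes-per-pixel bitmaps from the question. The in-place calls follow my reading of the vImage docs, so treat this as unverified; the "ARGB" in the function names refers to byte positions, not a required channel order:

#import <Accelerate/Accelerate.h>

// Wrap the raw bitmaps. vImage_Buffer fields are data, height, width, rowBytes.
vImage_Buffer rgb   = { imageDataRGB,   heightRGB, widthRGB, 4 * widthRGB };
vImage_Buffer alpha = { imageDataAlpha, heightRGB, widthRGB, 4 * widthRGB };

// Move the alpha image's first byte (its red channel) into the fourth byte
// position so it lines up with the RGB buffer's alpha slot. permuteMap[i]
// names the source channel that feeds destination channel i.
const uint8_t permuteMap[4] = { 0, 1, 2, 0 };
vImagePermuteChannels_ARGB8888(&alpha, &alpha, permuteMap, kvImageNoFlags);

// Copy only the fourth byte of each pixel (mask 0x1) from the permuted alpha
// buffer; the other three bytes are kept from the rgb buffer (in place).
vImageSelectChannels_ARGB8888(&alpha, &rgb, &rgb, 0x1, kvImageNoFlags);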

2 Answers


I second the suggestion made in a comment above to use separate textures and combine them in a shader. Let me explain why that is likely to be faster...
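As a sketch of what I mean (all names here are made up), the combining can live entirely in the fragment shader, with the alpha coming from a second, luminance-only texture:

// Hypothetical ES 2.0 fragment shader, embedded as a C string as usual.
static const char *kCombineFragmentShader =
    "precision mediump float;\n"
    "varying vec2 vTexCoord;\n"
    "uniform sampler2D uColorMap;   // the _RGB jpeg\n"
    "uniform sampler2D uAlphaMap;   // the _A jpeg, single channel\n"
    "void main() {\n"
    "    vec3 rgb = texture2D(uColorMap, vTexCoord).rgb;\n"
    "    float a  = texture2D(uAlphaMap, vTexCoord).r;\n"
    "    gl_FragColor = vec4(rgb, a);\n"
    "}\n";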

The number of texture2D calls itself should not have much to do with speed. The important factors affecting speed are: (1) how much data needs to be copied from CPU to GPU (you can test this easily: calling texture2D twice with N/2 pixels in each call is almost exactly as fast as calling it once with N pixels, everything else being equal), and (2) whether the implementation needs to rearrange the data in CPU memory before texture2D (if yes, the call can be extremely slow). Some texture formats need a rearrange and some don't; usually at least RGBA, RGB565 and some variants of YUV420 or YUYV do not need one. The stride, and whether the width/height are powers of two, may also matter.

I think that, if there is no need to rearrange data, one call with RGB and one call with A will be approximately as fast as a call with RGBA.

Since the rearrange is much slower than the copy, it would probably even be faster to copy RGBX (ignoring the fourth channel) and then A, rather than rearrange RGB and A on the CPU and then copy.
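That two-upload pattern might look like this (texture ids and pointers are hypothetical; it assumes the alpha bytes are tightly packed, one per pixel):

// Upload the color image as RGBA; the fourth byte is padding we never read.
glBindTexture(GL_TEXTURE_2D, colorTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, rgbxBytes);

// Upload the alpha image as a luminance texture: one byte per pixel, so
// roughly a quarter of the data; in the shader, texture2D(...).r is the alpha.
glBindTexture(GL_TEXTURE_2D, alphaTex);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);   // rows are no longer 4-byte multiples
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width, height, 0,
             GL_LUMINANCE, GL_UNSIGNED_BYTE, alphaBytes);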

P.S. "In a second step I'll save the raw image data out to a file in the cache. I'll then run a test to determine whether it's faster to load the raw data or to decompress the JPEGs and rebuild the raw data." - Reading raw data from anything other than memory is likely to be a lot slower than decompressing. A 1-megapixel image takes a few tens of milliseconds to decompress from jpeg, versus hundreds of milliseconds to read as raw data from a flash drive.

– Alex I
  • I agree on your second point that loading raw is likely slower, and I'll skip that step. But I'm about to test whether my fixed RGBA with 2 texture2D calls is faster than my older method of calling texture2D 4 times, since I've answered my own question on how to combine the 2 jpegs. What I don't understand in your answer is the part about "N/2 pixels in each call". The size of my texture mapping isn't changing; there's still the same number of pixels per fragment. It's just that before I was unable to get a clean RGB if I included another map encoded as the alpha. Maybe you can explain. – badweasel Mar 08 '14 at 07:50
  • Unless you're saying to create a luminance-only texture as my other texture and call one getting rgb only and the other getting alpha only. But according to the OpenGL ES 2.0 reference all texture2D calls return a vec4 - so I don't see how that would be any better. – badweasel Mar 08 '14 at 08:08
  • Sorry.. I'm really curious about a few things you mention in your answer, like the data copy from CPU to GPU. Aren't I only doing this once, when I load up the texture? The texture2D call I was referring to was the one in my shader. Once I've generated the texture, isn't it then on the GPU? The "faster" I'm talking about is the shader speed, not the texture-loading speed. – badweasel Mar 08 '14 at 08:35
  • @badweasel: "Unless you're saying to create a luminance-only texture as my other texture and call one getting rgb only and the other getting alpha only." - Yes, that's it. Copying over a luminance-only texture from CPU to GPU is generally about 4x faster than RGBA, because there is 1/4 as much data. Assuming there is no rearrangement, of course. – Alex I Mar 08 '14 at 10:23
  • @badweasel: "all texture2D calls return a vec4" - Yes, but that is not actually what happens under the hood. If you access a texture that is luminance-only, sure you get a value which you can use as a vec4, but the last 3 elements of that are just zero constants; they are not read from the texture. If you just do math with the first element and never reference the rest, there is no performance penalty at all. – Alex I Mar 08 '14 at 10:30
  • Thanks for letting me know that. On the iPhone 5 it's all screaming fast, 60fps all day. But on the iPhone 4S it's freaking slow; I was getting 9-16. I can only imagine what it would be on the 4. When I switched to the new textures and new shader it made almost no difference, and I was doing less texture binding because I had combined textures. I finally narrowed it down to drawing the background (no texture), which when removed upped my rates by 2x-3x on the 4S. – badweasel Mar 08 '14 at 19:56
  • Which do you think is better: A) bind 4 512x512 textures for my 4 maps and use 4 texture2D calls, each with a different texture sampler, OR B) use a 1024x1024 4-way texture as described here http://stackoverflow.com/questions/22110519/most-efficient-way-of-multi-texturing-ios-opengl-es2-optimization?lq=1 and do textureOffset and textureScaling math in the vertex shader to reach each quadrant? – badweasel Mar 08 '14 at 21:14
  • @badweasel: It is hard to say; this will be hardware-dependent. I wouldn't worry too much about using 4 textures, that seems to be what OpenGL wants you to do anyway (but make the alpha textures luminance-only, of course). The one case in which you would have a big performance difference is if your single 4-way texture has some areas in which you only need the alpha, but you are reading all 4 channels; that would be significantly slower. I would prefer method 1 in your other question. – Alex I Mar 09 '14 at 10:54
  • Thanks. For this particular game, I was trying to hammer it out in a couple of weeks, so I'm sticking with the 4-way method for now simply because all the textures are built and the coding is done. But for my overall engine I'm starting to think that individual textures might be faster in the end. And it's a great tip to make them luminance-only. – badweasel Mar 10 '14 at 04:45

It ends up being a pretty simple copy of the alpha over to the version that has the clean RGB. Despite the debate as to whether my shader is faster calling texture2D 2 or 4 times per fragment, the method below worked as a way to get un-premultiplied RGBA into my glTexImage2D(GL_TEXTURE_2D, ...) call.

This is my entire method:

-(BOOL)quickLoadTexPartsToNumber:(int)texIndex imageNameBase:(NSString *)name ofType:(NSString *)type flipImage:(bool)flipImage clamp:(bool)clamp mipmap:(bool)useMipmap
{
    //NSLog(@"loading image: %@ into %i",name, texIndex);

    // generate a new texture for that index number..  if it hasn't already been done
    if (textureInfo[texIndex].generated==NO) {
        glGenTextures(1, &textureInfo[texIndex].texture);
        textureInfo[texIndex].generated=YES;
    }

    glBindTexture(GL_TEXTURE_2D, textureInfo[texIndex].texture);

    // Note: GL_TEXTURE_MAG_FILTER only accepts GL_NEAREST or GL_LINEAR;
    // the mipmap filter modes are valid for GL_TEXTURE_MIN_FILTER only.
    if (useMipmap) {
        if (hardwareLimitions==HARDWARE_LIMITS_NONE) {
            glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_LINEAR_MIPMAP_LINEAR);
        }
        else {
            glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_NEAREST_MIPMAP_LINEAR);
        }
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
    }
    else {
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
    }

    if (clamp) {
        glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
        glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
    }

    NSString *path = [[NSBundle mainBundle] pathForResource:[NSString stringWithFormat:@"%@_RGB",name] ofType:type];

    NSData *texData = [[NSData alloc] initWithContentsOfFile:path];
    UIImage *imageRGB = [[UIImage alloc] initWithData:texData];

    path = [[NSBundle mainBundle] pathForResource:[NSString stringWithFormat:@"%@_A",name] ofType:type];

    texData = [[NSData alloc] initWithContentsOfFile:path];
    UIImage *imageAlpha = [[UIImage alloc] initWithData:texData];


    if (imageRGB == nil) {
        NSLog(@"************* Image %@ is nil - there's a problem", [NSString stringWithFormat:@"%@_RGB",name]);
        return NO;
    }
    if (imageAlpha == nil) {
        NSLog(@"************* Image %@ is nil - there's a problem", [NSString stringWithFormat:@"%@_A",name]);
        return NO;
    }


    GLuint widthRGB = CGImageGetWidth(imageRGB.CGImage);
    GLuint heightRGB = CGImageGetHeight(imageRGB.CGImage);

    GLuint widthAlpha = CGImageGetWidth(imageAlpha.CGImage);
    GLuint heightAlpha = CGImageGetHeight(imageAlpha.CGImage);

    if (widthRGB != widthAlpha || heightRGB != heightAlpha) {
        NSLog(@"************* Image %@ - RGB and Alpha sizes don't match", name);
    }

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    // unsigned char is an 8 bit unsigned integer
    unsigned char *imageDataRGB = malloc( heightRGB * widthRGB * 4 );
    unsigned char *imageDataAlpha = malloc( heightAlpha * widthAlpha * 4 );
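    // The key difference from my starter code: the RGB context uses
    // kCGImageAlphaNoneSkipLast, so Core Graphics never premultiplies; the
    // fourth byte is just padding that the loop below overwrites with the alpha.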
    CGContextRef thisContextRGB = CGBitmapContextCreate( imageDataRGB, widthRGB, heightRGB, 8, 4 * widthRGB, colorSpace, kCGImageAlphaNoneSkipLast | kCGBitmapByteOrder32Big );
    CGContextRef thisContextAlpha = CGBitmapContextCreate( imageDataAlpha, widthAlpha, heightAlpha, 8, 4 * widthAlpha, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big );

    if (flipImage)
    {
        // Flip the Y-axis  - don't want to do this because in this game I made all my vertex squares upside down.
        CGContextTranslateCTM (thisContextRGB, 0, heightRGB);
        CGContextScaleCTM (thisContextRGB, 1.0, -1.0);

        CGContextTranslateCTM (thisContextAlpha, 0, heightAlpha);
        CGContextScaleCTM (thisContextAlpha, 1.0, -1.0);
    }

    CGColorSpaceRelease( colorSpace );

    CGContextClearRect( thisContextRGB, CGRectMake( 0, 0, widthRGB, heightRGB ) );
    // draw the RGB version; this context has no alpha channel, so nothing gets premultiplied
    CGContextDrawImage( thisContextRGB, CGRectMake( 0, 0, widthRGB, heightRGB ), imageRGB.CGImage );

    CGContextClearRect( thisContextAlpha, CGRectMake( 0, 0, widthAlpha, heightAlpha ) );
    CGContextDrawImage( thisContextAlpha, CGRectMake( 0, 0, widthAlpha, heightAlpha ), imageAlpha.CGImage );

    int count = 4 * widthRGB * heightRGB;
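    // Both bitmaps are R,G,B,A in memory: byte i of imageDataAlpha is the red
    // channel of pixel i/4, and byte i+3 of imageDataRGB is that pixel's alpha slot.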
    for(int i=0; i < count; i+=4)
    {
        // copying the alpha (one byte) on to a non-premultiplied rgb is faster than copying the rgb over (3 bytes)
       imageDataRGB[i+3] = imageDataAlpha[i];
    }

    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, widthRGB, heightRGB, 0, GL_RGBA, GL_UNSIGNED_BYTE, imageDataRGB);

    glBindTexture(GL_TEXTURE_2D, textureInfo[texIndex].texture);

    // Generates a full MipMap chain to the current bound texture.
    if (useMipmap) {
        glGenerateMipmap(GL_TEXTURE_2D);
    }

    CGContextRelease(thisContextRGB);
    CGContextRelease(thisContextAlpha);

    free(imageDataRGB);
    free(imageDataAlpha);

    return YES;
}
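
Called with a hypothetical texture slot and base name, it picks up ship_RGB.jpg and ship_A.jpg from the bundle:

BOOL ok = [self quickLoadTexPartsToNumber:5 imageNameBase:@"ship" ofType:@"jpg"
                                flipImage:NO clamp:YES mipmap:YES];
if (!ok) NSLog(@"texture load failed");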

I did try using a TIFF instead of a PNG, and no matter what I tried, somewhere in the process the RGB was getting premultiplied with the alpha, thus destroying the RGB.

This method might be considered ugly, but it works on many levels for me, and it's the only way I've been able to get full, un-premultiplied RGBA8888 images into OpenGL.

– badweasel
  • This was the answer I was looking for: it answers the actual question of how to do it. Whether or not it should be done is another issue, and I upvoted Alex for that. – badweasel Mar 11 '14 at 20:53