I have an animated background layer that is built on top of OpenGL ES 2.0 and I am using GLKView as graphics container and GLKViewController as controller. For drawing I use GLKBaseEffect.

I introduced a sprite class that can load PNG files as textures, transform the sprite (scale, rotate, translate) and set additional properties such as alpha blending.

I am wondering how I could optimise my program, since the frame-rate drops on my iPhone 4S to about 25 FPS when displaying 50 sprites (all with the same texture/png-file!) with a size of 128x128 px each.

In the following sections I have listed the important parts of the program. Currently I call glDrawArrays(GL_TRIANGLE_STRIP, 0, 4) for each of the 50 sprites in every frame (the target frame rate is 60), which amounts to 3000 calls per second.

Could that be the bottleneck? How could I optimise this?

This is how I initialise the sprite array (GLKViewController.m):

- (void)initParticles {
    if(sprites==nil) {
        sprites = [NSMutableArray array];
        for (int i=0; i<50; i++) {
            Sprite* sprite = [[Sprite alloc] initWithFile:@"bubble" extension:@"png" effect:effect];
            // configure some sprite properties [abbreviated]
            [sprites addObject:sprite];
        }
    }
}

This is the rendering function (GLKViewController.m):

- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect {
    glClearColor(0.0, 0.0, 1.0, 1.0);
    glClear(GL_COLOR_BUFFER_BIT);
    // render the bubbles
    for (Sprite* sprite in sprites) {
        [sprite render];
    }
}

Here are some important parts of the sprite class (Sprite.m):

- (id)initWithFile:(NSString *)filename extension:(NSString*)extension effect:(GLKBaseEffect *)effect {
    if(self = [self init]) {
        self.effect = effect;
        NSDictionary* options = [NSDictionary dictionaryWithObjectsAndKeys:[NSNumber numberWithBool:YES], GLKTextureLoaderOriginBottomLeft, nil];
        NSError* error = nil;
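        // Pass the extension through; with ofType:nil the lookup would require a file named exactly "bubble".
        NSString *path = [[NSBundle mainBundle] pathForResource:filename ofType:extension];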
        self.textureInfo = [GLKTextureLoader textureWithContentsOfFile:path options:options error:&error];
        if (self.textureInfo == nil) {
            NSLog(@"Error loading file: %@", [error localizedDescription]);
            return nil;
        }
        TexturedQuad newQuad;
        newQuad.bl.geometryVertex = GLKVector2Make(0, 0);
        newQuad.br.geometryVertex = GLKVector2Make(self.textureInfo.width, 0);
        newQuad.tl.geometryVertex = GLKVector2Make(0, self.textureInfo.height);
        newQuad.tr.geometryVertex = GLKVector2Make(self.textureInfo.width, self.textureInfo.height);
        newQuad.bl.textureVertex = GLKVector2Make(0, 0);
        newQuad.br.textureVertex = GLKVector2Make(1, 0);
        newQuad.tl.textureVertex = GLKVector2Make(0, 1);
        newQuad.tr.textureVertex = GLKVector2Make(1, 1);
        self.quad = newQuad;
    }
    return self;
}

- (void)render {
    [self applyBaseEffect];
    long offset = (long)&_quad;
    glEnableVertexAttribArray(GLKVertexAttribPosition);
    glEnableVertexAttribArray(GLKVertexAttribTexCoord0);
    glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE, sizeof(TexturedVertex), (void*) (offset + offsetof(TexturedVertex, geometryVertex)));
    glVertexAttribPointer(GLKVertexAttribTexCoord0, 2, GL_FLOAT, GL_FALSE, sizeof(TexturedVertex), (void*) (offset + offsetof(TexturedVertex, textureVertex)));
    glBlendColor(1.0, 1.0, 1.0, self.alpha);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
    glDisableVertexAttribArray(GLKVertexAttribPosition);
    glDisableVertexAttribArray(GLKVertexAttribTexCoord0);
}

- (void)applyBaseEffect {
    self.effect.texture2d0.name = self.textureInfo.name;
    self.effect.texture2d0.envMode = GLKTextureEnvModeModulate;
    self.effect.texture2d0.target = GLKTextureTarget2D;
    self.effect.texture2d0.enabled = GL_TRUE;
    self.effect.useConstantColor = GL_TRUE;
    self.effect.constantColor = GLKVector4Make(self.tint.r*self.alpha, self.tint.g*self.alpha, self.tint.b*self.alpha, self.alpha);
    self.effect.transform.modelviewMatrix = GLKMatrix4Multiply(GLKMatrix4Identity, [self modelMatrix]);
    [self.effect prepareToDraw];
}

- (GLKMatrix4)modelMatrix {
    GLKMatrix4 modelMatrix = GLKMatrix4Identity;
    modelMatrix = GLKMatrix4Translate(modelMatrix, self.position.x, self.position.y, 0);
    modelMatrix = GLKMatrix4Rotate(modelMatrix, self.rotation, 0, 0, 1);
    // Keep the z scale at 1 so the matrix stays invertible; the quads lie in the z = 0 plane anyway.
    modelMatrix = GLKMatrix4Scale(modelMatrix, self.scale, self.scale, 1);
    modelMatrix = GLKMatrix4Translate(modelMatrix, -self.normalSize.width/2, -self.normalSize.height/2, 0);
    return modelMatrix;
}

EDIT-1: Here are some performance indicators (seems to be GPU-bound)

[Screenshot: performance indicators]

EDIT-2: The frame rate improved from 25 to 45 FPS on my iPhone 4S when I added the view.drawableMultisample line, regardless of which value I set (none or 4x). Strange: my rendering code does not seem to be slowed down by MSAA; if anything, the opposite.

GLKView *view = (GLKView*)self.view;
view.context = context;
view.drawableMultisample = GLKViewDrawableMultisampleNone;
//view.drawableMultisample = GLKViewDrawableMultisample4x;
genpfault
salocinx
  • Do the sprites all have the same characteristics (vertices, texture, etc.)? – xcesco Aug 14 '15 at 11:03
  • Yes they are all the same (some bubbles moving from the bottom to the top). I only manipulate them with `effect.transform.modelviewMatrix`. – salocinx Aug 14 '15 at 11:17
  • You need to profile first in order to know if your app is CPU or GPU bound. Please check [this](https://developer.apple.com/library/ios/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/ToolsOverview/ToolsOverview.html), then run your app from Xcode and add a screenshot of the result here. I'd like to see the values for "Utilization" and "Frame Time" – VB_overflow Aug 14 '15 at 12:05
  • @VB_overflow: Okay I added the performance measurement. Seems to be GPU bound, right? – salocinx Aug 14 '15 at 12:44
  • Yes, you are GPU bound, or more precisely bound by the fragment shader / rasterization stage. I will try to write a detailed answer later. Basically this means that your fragment shaders are too complex and/or too much of the screen is covered by alpha-blended quads (alpha blending with a lot of overdraw is a performance killer on iPhone 4 / 4S; alpha-blended quads cost a lot more than opaque ones). Also, did you enable MSAA? Using MSAA is not advised if you target 60 fps on low-end iOS devices. – VB_overflow Aug 14 '15 at 13:08
  • @VB_overflow: No, I didn't use MSAA to my knowledge (I didn't set it at all). But I have now inserted `view.drawableMultisample = GLKViewDrawableMultisampleNone` and `GLKViewDrawableMultisample4x` in turn for testing purposes, and the frame rate improved from 25 to 45 in both cases. I updated my question (EDIT-2). Regarding the fragment shaders: I only use the linear+radial gradient shader (from the other question) combined with the 50 sprites, which are drawn by the GLKBaseEffect. I can't do without alpha blending, but if necessary I will reduce the number of sprites on low-end devices. – salocinx Aug 14 '15 at 13:40
  • I do not understand your MSAA results. Are you saying that GLKViewDrawableMultisampleNone gives you 45 fps while GLKViewDrawableMultisample4x gives you 25 fps? Or that just setting a value for view.drawableMultisample improves your fps (this seems highly improbable...)? – VB_overflow Aug 14 '15 at 14:11
  • Also, I think the problem mainly comes from your background. Its shader is too heavy for low-end iOS devices (I know it does not do much, but this is already too much...). Since this is only a "static background", why not use a premade texture instead of a shader? Note that by using Frame Buffer Objects you could render the background to a texture using the shader and then use this texture as the background (so the texture would be generated by code...). Last thing: please make sure that alpha blending is disabled when you render the background (I don't think it needs it). – VB_overflow Aug 14 '15 at 14:29
  • @VB_overflow: Yes, you are right, this would be highly improbable. I have now realised that it heavily depends on which view controller I am in. Some have 2 additional UIViews and some have more. The more there are, the slower the animated background. I now switch alpha blending "off" for rendering the gradient background and "on" for the foreground sprites within the draw function. Moreover, in view controllers that contain a UIScrollView, the background completely stops moving during scrolling :-/ I like your idea of pre-rendering the background. I will try to accomplish this first. – salocinx Aug 14 '15 at 14:58
  • Btw: Disabling the fragment shader that paints the linear+gradient background does not change the FPS much (+/- 1 FPS). Moreover, my UIButtons use also alpha-blending combined with a drop-shadow... (in order to mix nicely with the animated background). Maybe this is just too much for an iPhone 4S ;-) Probably I will have to reconsider my visual design. Thank you for your help so far!! – salocinx Aug 14 '15 at 15:05
  • @VB_overflow: As you suggested, it is probably the best to analyse the app with a sophisticated profiler tool. I will do that tomorrow and give you some feedback. – salocinx Aug 14 '15 at 15:22

3 Answers

To improve performance, avoid redundant state changes. You probably set the shader program for each sprite, but since you always use the same shader to draw the sprites, you can set it once instead. The same optimization applies to the vertex and texture coordinates. For each sprite, just calculate the MVP (Model-View-Projection) matrix.

The method 'drawFrame' in pseudo code:

...
[set sprite program shader]
[set vertex coordinates]
[set texture coordinates]
[set textures ]
for each sprite
   [calculate matrix MVP]
   [draw sprite]
end for
...

Another optimization is to use Vertex Buffer Objects, but in your case I think it has lower priority than the sprite batch optimization.

I work on Android, so I cannot help you with example code. Nevertheless, I suggest you have a look at Libgdx to see how they implement their sprite batch.
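For illustration only, here is a minimal sketch of the sprite-batch idea in plain C (which compiles unchanged inside an Objective-C file). It is not taken from Libgdx or from the question's code. The names `BatchVertex`, `SpriteState` and `build_sprite_batch` are hypothetical, and the sketch assumes that every sprite shares one texture and that the caller supplies the cosine and sine of each sprite's rotation. The scale/rotate/translate is applied on the CPU, so all quads end up in one vertex array that can be drawn with a single call.

```c
#include <stddef.h>

/* Hypothetical vertex layout: 2D position followed by 2D texture coordinate. */
typedef struct { float x, y, u, v; } BatchVertex;

/* Hypothetical per-sprite state. cos_rot/sin_rot are the cosine and sine of
   the rotation angle, computed by the caller (e.g. with cosf/sinf). */
typedef struct {
    float x, y;             /* centre position on screen */
    float cos_rot, sin_rot; /* rotation, pre-evaluated */
    float scale;            /* uniform scale factor */
    float width, height;    /* unscaled sprite size */
} SpriteState;

/* Writes 6 vertices (two triangles) per sprite into `out`, which must have
   room for 6 * count entries. Returns the number of vertices written. */
size_t build_sprite_batch(const SpriteState *sprites, size_t count,
                          BatchVertex *out)
{
    /* Unit-quad corners as two triangles: bl, br, tl, tl, br, tr. */
    static const float corners[6][2] = {
        {0, 0}, {1, 0}, {0, 1}, {0, 1}, {1, 0}, {1, 1}
    };
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        const SpriteState *s = &sprites[i];
        for (int k = 0; k < 6; k++) {
            /* Centre the quad on the origin, then scale it. */
            float lx = (corners[k][0] - 0.5f) * s->width * s->scale;
            float ly = (corners[k][1] - 0.5f) * s->height * s->scale;
            /* Rotate about the centre, then translate to the position. */
            out[n].x = s->x + lx * s->cos_rot - ly * s->sin_rot;
            out[n].y = s->y + lx * s->sin_rot + ly * s->cos_rot;
            /* Texture coordinates are the untransformed corners. */
            out[n].u = corners[k][0];
            out[n].v = corners[k][1];
            n++;
        }
    }
    return n;
}
```

With an array like this, the effect and the vertex attribute pointers would be set up once per frame with an identity model-view matrix, followed by one `glDrawArrays(GL_TRIANGLES, 0, (GLsizei)n)` instead of one draw call per sprite. Per-sprite tint or alpha would then have to travel as an extra per-vertex colour attribute, which this sketch leaves out.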

xcesco
  • Many thanks for your suggestions. Meanwhile we found out that the app is GPU bound. I think that means that VBOs/VAOs would probably help to improve the performance more than changing the code running on the CPU. – salocinx Aug 14 '15 at 13:44
  • Sure, but you will save a lot of time if you reduce changes to the OpenGL state machine. I had the same problem with an Android OpenGL wallpaper and I got a good performance improvement. If you can, try it. – xcesco Aug 14 '15 at 14:04
  • Okay thanks for your suggestion, I will try it asap and give some feedback here. Thank you very much! – salocinx Aug 14 '15 at 15:00
  • I think VBOs would be a huge win too — as is, the driver doesn't have enough information to know that it doesn't need to transfer your vertices from CPU address space and format to GPU address space and format upon every `render`. So that's both a state issue (which, generally speaking, are troublesome because they imply synchronisation costs) and a bus issue (which is explicit synchronisation). I might actually be tempted to look at that first. – Tommy Aug 14 '15 at 15:46

Your performance counters seem to indicate you are GPU fill-rate bound. You could check that your bubble.png asset is making good use of its texture space, since unnecessary overdraw might be part of your problem.

You could open it in Photoshop or GIMP and autocrop the image. If that removes any texels from the edges, you had some wasted space. If you can eliminate the wasted space, you can reduce overdraw.
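The same check can also be done in code. The following is only a sketch, assuming the PNG has already been decoded into a tightly packed 8-bit RGBA buffer. The names `TexelBounds` and `opaque_bounds` are made up for this example. It computes the bounding box of all texels whose alpha is not zero, which is essentially what an autocrop would keep.

```c
#include <stddef.h>
#include <stdint.h>

/* Bounding box of the non-transparent texels (inclusive coordinates).
   `empty` is 1 when every texel is fully transparent. */
typedef struct { int min_x, min_y, max_x, max_y; int empty; } TexelBounds;

/* rgba: width * height texels, row-major, 4 bytes each (R, G, B, A). */
TexelBounds opaque_bounds(const uint8_t *rgba, int width, int height)
{
    TexelBounds b = { width, height, -1, -1, 1 };
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* The alpha byte is the fourth component of each texel. */
            uint8_t alpha = rgba[((size_t)y * (size_t)width + (size_t)x) * 4 + 3];
            if (alpha != 0) {
                if (x < b.min_x) b.min_x = x;
                if (y < b.min_y) b.min_y = y;
                if (x > b.max_x) b.max_x = x;
                if (y > b.max_y) b.max_y = y;
                b.empty = 0;
            }
        }
    }
    return b;
}
```

If the returned box is noticeably smaller than the full texture, the quad is rasterizing and blending fully transparent fragments around the bubble, and a tighter crop would save that fill.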

You mention that the bubbles are 128x128, but it isn't clear whether that's the texture size or the on-screen size. Reducing the on-screen size will improve performance.
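To get a feel for the numbers, here is a small back-of-the-envelope helper (the name `blended_coverage` is hypothetical). It assumes the sprites do not overlap the screen edge and that the sizes are given in pixels, and it expresses the blended area per frame as a multiple of the screen area. The iPhone 4S screen is 640x960 pixels.

```c
/* Total sprite area per frame divided by the screen area. A value of 1.0
   means the blended quads add up to one full screen of fragments. */
double blended_coverage(int sprite_count, int sprite_w, int sprite_h,
                        int screen_w, int screen_h)
{
    double sprite_pixels = (double)sprite_count * sprite_w * sprite_h;
    return sprite_pixels / ((double)screen_w * screen_h);
}
```

For 50 sprites of 128x128 pixels this gives 819,200 / 614,400, or about 1.33 screens of blended fragments per frame. If the 128 were instead points on the 2x Retina display (256x256 pixels), it would be about 5.33 screens, while halving the size to 64x64 pixels would drop it to about 0.33.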

Beyond that, you probably need to dig into your Sprite class and look at what the fragment shader is actually doing. Choosing more efficient texture formats and enabling mipmapping might also help.

Columbo

After another two hours of intensive research I finally found the cause of the massive performance hit. As described in a comment, the performance drop grew with the number of buttons contained in a view controller. Since I decorate my buttons with rounded corners and a drop shadow, I looked into that... Here is part of my original category on the UIView class for decorating UI elements:

#import "UIView+Extension.h"

@implementation UIView (Extension)

- (void)styleViewWithRoundedEdges:(BOOL)rounded shadowed:(BOOL)shadowed {
    if (rounded) {
        self.layer.cornerRadius = 3.0;
    }
    if (shadowed) {
        self.layer.shadowColor = [UIColor blackColor].CGColor;
        self.layer.shadowOffset = CGSizeMake(2.0, 2.0);
        self.layer.shadowOpacity = 0.25;
        self.layer.shadowRadius = 1.0;
    }
}

@end

As soon as I switched off the drop shadow, the frame rate went up to a constant 60 FPS.

A quick search directed me to this SO thread.

Adding these two lines solved the problem:

self.layer.shouldRasterize = YES;
self.layer.rasterizationScale = UIScreen.mainScreen.scale;

I can now easily render 100 sprites combined with the linear+radial background gradient shader on an iPhone 4S at 60 FPS :-D.

But thank you guys so much for your patience and help! Let me know when I can return the favour!

Community
salocinx