I'm in the process of writing a little game to teach myself OpenGL rendering as it's one of the things I haven't tackled yet. I used SDL before and this same function, while still performing badly, didn't go as over the top as it does now.

Basically, there is not much going on in my game yet, just some basic movement and background drawing. When I switched to OpenGL, it appears as if it's way too fast. My frames per second exceed 2000 and this function uses up most of the processing power.

What is interesting is that the program in its SDL version used 100% CPU but ran smoothly, while the OpenGL version uses only about 40% - 60% CPU but seems to tax my graphics card in such a way that my whole desktop becomes unresponsive. Bad.

It's not a very complex function: it renders a 1024x1024 background tile according to the player's X and Y coordinates to give the impression of movement while the player graphic itself stays locked in the center. Because it's a small tile for a bigger screen, I have to render it multiple times to stitch the tiles together for a full background. The two for loops in the code below iterate 12 times, combined, so I can see why this is inefficient when called 2000 times per second.

So to get to the point, this is the evil-doer:

void render_background(game_t *game)
{
    int bgw;
    int bgh;

    int x, y;

    glBindTexture(GL_TEXTURE_2D, game->art_background);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH,  &bgw);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &bgh);

    glBegin(GL_QUADS);

    /*
     * Start one background tile too early and end one too late
     * so the player can not outrun the background
     */
    for (x = -bgw; x < root->w + bgw; x += bgw)
    {
        for (y = -bgh; y < root->h + bgh; y += bgh)
        {
            /* Offsets */
            int ox = x + (int)game->player->x % bgw;
            int oy = y + (int)game->player->y % bgh;

            /* Top Left */
            glTexCoord2f(0, 0);
            glVertex3f(ox, oy, 0);

            /* Top Right */
            glTexCoord2f(1, 0);
            glVertex3f(ox + bgw, oy, 0);

            /* Bottom Right */
            glTexCoord2f(1, 1);
            glVertex3f(ox + bgw, oy + bgh, 0);

            /* Bottom Left */
            glTexCoord2f(0, 1);
            glVertex3f(ox, oy + bgh, 0);
        }
    }

    glEnd();
}

If I artificially limit the speed by calling SDL_Delay(1) in the game loop, I cut the FPS down to ~660 ± 20 and get no "performance overkill". But I doubt that is the correct way to go about this.

For the sake of completion, these are my general rendering and game loop functions:

void game_main()
{
    long current_ticks = 0;
    long elapsed_ticks;
    long last_ticks = SDL_GetTicks();

    game_t game;
    object_t player;

    if (init_game(&game) != 0)
        return;

    init_player(&player);
    game.player = &player;

    /* game_init() */
    while (!game.quit)
    {
        /* Update number of ticks since last loop */
        current_ticks = SDL_GetTicks();
        elapsed_ticks = current_ticks - last_ticks;

        last_ticks = current_ticks;

        game_handle_inputs(elapsed_ticks, &game);
        game_update(elapsed_ticks, &game);

        game_render(elapsed_ticks, &game);

        /* Lagging stops if I enable this */
        /* SDL_Delay(1); */
    }

    cleanup_game(&game);


    return;
}

void game_render(long elapsed_ticks, game_t *game)
{
    game->tick_counter += elapsed_ticks;

    if (game->tick_counter >= 1000)
    {
        game->fps = game->frame_counter;
        game->tick_counter = 0;
        game->frame_counter = 0;

        printf("FPS: %d\n", game->fps);
    }

    render_background(game);
    render_objects(game);

    SDL_GL_SwapBuffers();
    game->frame_counter++;

    return;
}

According to gprof profiling, even when I limit the execution with SDL_Delay(), it still spends about 50% of the time rendering my background.

LukeN
  • Create a timer to limit the frame-rate ... – Jason Aug 28 '11 at 15:10
  • Unless I'm missing something, can't you write this whole function as a [glCallList](http://www.opengl.org/sdk/docs/man/xhtml/glCallList.xml) and then use one call to [glTranslatef](http://www.opengl.org/sdk/docs/man/xhtml/glTranslate.xml) to control where it gets drawn? – Flexo Aug 28 '11 at 15:12
  • Idle processing is a bad thing: by getting better performance you actually create new problems. Use a multimedia timer instead of idle processing. – Alex F Aug 28 '11 at 15:15
  • gprof will only tell you "self-time" of CPU-bound code, plus it tries to deduce inclusive time of higher-level code, by a very questionable method. [Don't expect it to tell you much at all.](http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343) If you're on linux, really, try [Zoom](http://www.rotateright.com/). – Mike Dunlavey Aug 29 '11 at 01:40

3 Answers

6

Turn on VSYNC. That way you'll calculate graphics data exactly as fast as the display can present it to the user, and you won't waste CPU or GPU cycles calculating extra frames in between that will just be discarded because the monitor is still busy displaying a previous frame.

Ben Voigt
  • As far as I remember, VSYNC only works in fullscreen which I am not using here.. is my memory lying to me? – LukeN Aug 28 '11 at 15:33
  • @LukeN: That sort of limitation might be platform-specific. In any case, I encourage you to just try it. – Ben Voigt Aug 28 '11 at 15:34
  • @LukeN: VSync is independent of fullscreen or not. – datenwolf Aug 28 '11 at 19:06
  • First of all, V-Sync framecap set via the driver is system wide, which you may not want. Next, it isn't always available on other OS. If you want to do it programmatically, you are pretty much limited to Windows as well. So this is not a good idea imo. – Razzupaltuff Aug 29 '11 at 09:19
  • @karx: I'm not recommending a driver-level solution. Both the wgl and glx APIs provide functions for controlling vsync per-context. – Ben Voigt Aug 29 '11 at 13:35
  • I have never read about these. Do you have a link? I have one: http://stackoverflow.com/questions/589064/how-to-enable-vertical-sync-in-opengl. According to what has been stated there, you are wrong. – Razzupaltuff Aug 29 '11 at 17:32
  • I have never seen that before - and I could have needed it. Now I wonder how much you can rely on the presence of that extension on Linux boxes. An in-game solution will always work. – Razzupaltuff Aug 29 '11 at 23:08
3

First of all, you don't need to render the tile x*y times - you can render it once for the entire area it should cover and use GL_REPEAT to have OpenGL cover the entire area with it. All you need to do is to compute the proper texture coordinates once, so that the tile doesn't get distorted (stretched). To make it appear to be moving, increase the texture coordinates by a small margin every frame.
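As a rough sketch of the coordinate computation described above, the helper below derives the texture coordinates for one screen-sized quad from the player position, using the same integer offset as the question's loop. The names `tex_rect_t` and `background_tex_rect` are hypothetical, and the sketch assumes the texture wrap mode has been set to GL_REPEAT so that coordinates outside [0, 1] tile the image.

```c
/* Texture coordinates for a single quad covering the whole screen.
 * Hypothetical helper, not part of the original code. */
typedef struct {
    float u0, v0;   /* texture coordinate at the quad's top-left corner */
    float u1, v1;   /* texture coordinate at the quad's bottom-right corner */
} tex_rect_t;

tex_rect_t background_tex_rect(float player_x, float player_y,
                               int screen_w, int screen_h,
                               int tile_w, int tile_h)
{
    tex_rect_t r;

    /* Shift the texture by the player's offset inside one tile,
     * expressed in tile units (same integer offset as the question's code). */
    r.u0 = -(float)((int)player_x % tile_w) / (float)tile_w;
    r.v0 = -(float)((int)player_y % tile_h) / (float)tile_h;

    /* The quad spans screen_w / tile_w tiles horizontally and
     * screen_h / tile_h tiles vertically, so nothing gets stretched. */
    r.u1 = r.u0 + (float)screen_w / (float)tile_w;
    r.v1 = r.v0 + (float)screen_h / (float)tile_h;

    return r;
}
```

The four values would then go to glTexCoord2f for the corners of one quad drawn from (0, 0) to (root->w, root->h), after setting GL_TEXTURE_WRAP_S and GL_TEXTURE_WRAP_T to GL_REPEAT with glTexParameteri. That replaces the twelve quads per frame with a single one.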

Now down to limiting the speed. What you want to do is not to just plug a sleep() call in there, but measure the time it takes to render one complete frame:

/* Sleep away whatever is left of the frame's time budget (all times in ms) */
void FrameCap (Uint32 desiredFrameTime, Uint32 actualFrameTime)
{
   if (desiredFrameTime > actualFrameTime)
      SDL_Delay (desiredFrameTime - actualFrameTime); // there is a small imprecision here
}

Uint32 desiredFrameTime = 1000 / 60; // e.g. a 60 FPS cap, about 16 ms per frame
Uint32 startTime = SDL_GetTicks ();
// render frame
FrameCap (desiredFrameTime, SDL_GetTicks () - startTime);

There are ways to make this more precise (e.g. by using the performance counter functions on Windows 7, or using microsecond resolution on Linux), but I think you get the general idea. This approach also has the advantage of being driver independent and - unlike coupling to V-Sync - allowing an arbitrary frame rate.

Razzupaltuff
  • Arbitrary frame rate is useless. The only rate cap that makes sense is the one that matches the user's display. How do you propose determining whether the user's display system is running at 60 Hz, 72 Hz, 75 Hz, 85 Hz (or even 100 Hz or 120 Hz on high-end hardware)? – Ben Voigt Aug 29 '11 at 13:37
  • You have never played a shooter game, have you? Your comment is nonsense. A lot more things can be tied to the framerate than just the display update (e.g. physics or AI updates). – Razzupaltuff Aug 29 '11 at 17:31
  • There may be benefit to simulating motion, hit testing, reading mouse input, etc more often than the render rate. But there's never any advantage to rendering more often than the display system can accept. (Assuming this is for real-time playback, if you're generating a video file to be played later, then render as fast as possible. But you still render for a playback rate corresponding to the display.) – Ben Voigt Aug 29 '11 at 20:15
  • Unless you put that stuff in separate threads, you have to tune it via some in-game framecap. – Razzupaltuff Aug 29 '11 at 23:06
1

At 2000 FPS it only takes 0.5 ms to render the entire frame. If you want to get 60 FPS then each frame should take about 16 ms. To do this, first render your frame (about 0.5 ms), then use SDL_Delay() to use up the rest of the 16 ms.
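The arithmetic above can be sketched as a small helper that returns how long to sleep after rendering. The name `frame_delay_ms` and its parameters are hypothetical; the sketch assumes millisecond timing as returned by SDL_GetTicks(), so a 0.5 ms render would be measured as 0 or 1 ms.

```c
/* Milliseconds left to sleep so that one frame lasts 1000 / target_fps ms.
 * Hypothetical helper; integer division truncates, so 60 FPS gives 16 ms. */
unsigned int frame_delay_ms(unsigned int target_fps, unsigned int render_ms)
{
    unsigned int budget_ms;

    if (target_fps == 0)
        return 0u;                      /* treat 0 as "no cap requested" */

    budget_ms = 1000u / target_fps;     /* time budget for one frame */

    /* Sleep only if rendering finished before the budget ran out. */
    return render_ms < budget_ms ? budget_ms - render_ms : 0u;
}
```

In the game loop this could be used as SDL_Delay(frame_delay_ms(60, SDL_GetTicks() - frame_start)), where frame_start is the tick count taken before rendering. Note that SDL_Delay() may sleep slightly longer than requested, so the resulting frame rate can end up a little below the target.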

Also, if you are interested in profiling your code (which isn't needed if you are getting 2000 FPS!) then you may want to use High Resolution Timers. That way you could tell exactly how long any block of code takes, not just how much time your program spends in it.

fintelia