
My ultimate goal is to render 1 million spheres of different sizes and colors at 60 fps. I want to be able to move the camera around the screen as well.

I have modified the code on this page of the tutorial I am studying to try to instance 1 million cubes. I have been able to instance up to 90,000 cubes, but if I try to instance 160,000 cubes then the program breaks. I get an error that the program has "stopped working" and unexpectedly quit. I don't know what kind of error this is, but I believe that it may be memory related.

My understanding of instancing is naive, so I do not know what the problem is. I believe that instancing 1 million cubes is the next step to my goal of instancing 1 million spheres. So, my question: How do I instance 1 million cubes/objects in OpenGL?

I have been learning OpenGL through this tutorial, so I use 32-bit GLEW and 32-bit GLFW in Visual Studio 2013. I have 8 GB of RAM on a 64-bit operating system (Windows 7) with a 2.30 GHz CPU.

My code is below:

(Set NUM_INS, defined on the second line of the code, to the number of cubes to be instanced. Make sure it is a perfect square.)

// Make sure NUM_INS is a square number
#define NUM_INS 9

// GLEW
#define GLEW_STATIC
#include <GL/glew.h>

// GLFW
#include <GLFW/glfw3.h>

// GL includes
#include "Shader.h"

// GLM Mathematics
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <glm/gtc/type_ptr.hpp>

// Properties
GLuint screenWidth = 800, screenHeight = 600;

// Function prototypes
void key_callback(GLFWwindow* window, int key, int scancode, int action, int mode);


// The MAIN function, from here we start our application and run the Game loop
int main()
{
    // Init GLFW
    glfwInit();
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    glfwWindowHint(GLFW_RESIZABLE, GL_FALSE);

    GLFWwindow* window = glfwCreateWindow(screenWidth, screenHeight, "LearnOpenGL", nullptr, nullptr); // Windowed
    glfwMakeContextCurrent(window);

    // Set the required callback functions
    glfwSetKeyCallback(window, key_callback);

    // Initialize GLEW to setup the OpenGL Function pointers
    glewExperimental = GL_TRUE;
    glewInit();

    // Define the viewport dimensions
    glViewport(0, 0, screenWidth, screenHeight);

    // Setup OpenGL options
    //glEnable(GL_DEPTH_TEST);

    // Setup and compile our shader(s)
    Shader shader("core.vs", "core.frag");

    // Generate a list of NUM_INS cube locations/translation-vectors
    glm::vec2 translations[NUM_INS];
    int index = 0;
    GLfloat offset = 1.0f/sqrt(NUM_INS);
    for (GLint y = -sqrt(NUM_INS); y < sqrt(NUM_INS); y += 2)
    {
        for (GLint x = -sqrt(NUM_INS); x < sqrt(NUM_INS); x += 2)
        {
            glm::vec2 translation;
            translation.x = (GLfloat)x / sqrt(NUM_INS) + offset;
            translation.y = (GLfloat)y / sqrt(NUM_INS) + offset;
            translations[index++] = translation;
        }
    }

    // Store instance data in an array buffer
    GLuint instanceVBO;
    glGenBuffers(1, &instanceVBO);
    glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(glm::vec2) * NUM_INS, &translations[0], GL_STATIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);

    // Generate quad VAO
    GLfloat quadVertices[] = {
        // Positions   // Colors
        -0.05f,  0.05f,  1.0f, 0.0f, 0.0f,
        0.05f, -0.05f,  0.0f, 1.0f, 0.0f,
        -0.05f, -0.05f,  0.0f, 0.0f, 1.0f,

        -0.05f,  0.05f,  1.0f, 0.0f, 0.0f,
        0.05f, -0.05f,  0.0f, 1.0f, 0.0f,
        0.05f,  0.05f,  0.0f, 0.0f, 1.0f
    };

    GLfloat vertices[] = {
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 1.0f,

        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 1.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 1.0f,

        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,

        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,

        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,

        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  1.0f, 1.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  1.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS),  0.0f, 0.0f, 0.0f,
        -0.5f/sqrt(NUM_INS),  0.5f/sqrt(NUM_INS), -0.5f/sqrt(NUM_INS),  0.0f, 1.0f, 0.0f
    };

    GLuint quadVAO, quadVBO;
    glGenVertexArrays(1, &quadVAO);
    glGenBuffers(1, &quadVBO);
    glBindVertexArray(quadVAO);
    glBindBuffer(GL_ARRAY_BUFFER, quadVBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(GLfloat), (GLvoid*)0);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(GLfloat), (GLvoid*)(3 * sizeof(GLfloat))); // Color starts after the 3 position floats
    // Also set instance data
    glEnableVertexAttribArray(2);
    glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
    glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 2 * sizeof(GLfloat), (GLvoid*)0);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glVertexAttribDivisor(2, 1); // Tell OpenGL this is an instanced vertex attribute.
    glBindVertexArray(0);


    // Game loop
    while (!glfwWindowShouldClose(window))
    {
        // Check and call events
        glfwPollEvents();

        // Clear buffers
        glClearColor(0.03f, 0.03f, 0.03f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);

        // Draw NUM_INS instanced cubes
        shader.Use();
        glBindVertexArray(quadVAO);
        glDrawArraysInstanced(GL_TRIANGLES, 0, 36, NUM_INS); // NUM_INS cubes of 36 vertices each
        glBindVertexArray(0);

        // Swap the buffers
        glfwSwapBuffers(window);
    }

    glfwTerminate();
    return 0;
}

// Is called whenever a key is pressed/released via GLFW
void key_callback(GLFWwindow* window, int key, int scancode, int action, int mode)
{
    if (key == GLFW_KEY_ESCAPE && action == GLFW_PRESS)
        glfwSetWindowShouldClose(window, GL_TRUE);
}

Vertex Shader: (named core.vs)

#version 330 core
layout (location = 0) in vec3 position;
layout (location = 1) in vec3 color;
layout (location = 2) in vec2 offset;

out vec3 fColor;

void main()
{
    gl_Position = vec4(position.x + offset.x, position.y + offset.y, position.z, 1.0f);
    fColor = color;
}

Fragment Shader: (named core.frag)

#version 330 core
in vec3 fColor;
out vec4 color;

void main()
{
    color = vec4(fColor, 1.0f);
}

Shader class: (named Shader.h)

#pragma once

// Std. Includes
#include <vector>

// GL Includes
#include <GL/glew.h>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>



// Defines several possible options for camera movement. Used as abstraction to stay away from window-system specific input methods
enum Camera_Movement {
    FORWARD,
    BACKWARD,
    LEFT,
    RIGHT
};

// Default camera values
const GLfloat YAW = -90.0f;
const GLfloat PITCH = 0.0f;
const GLfloat SPEED = 3.0f;
const GLfloat SENSITIVTY = 0.25f;
const GLfloat ZOOM = 45.0f;


// An abstract camera class that processes input and calculates the corresponding Euler Angles, Vectors and Matrices for use in OpenGL
class Camera
{
public:
    // Camera Attributes
    glm::vec3 Position;
    glm::vec3 Front;
    glm::vec3 Up;
    glm::vec3 Right;
    glm::vec3 WorldUp;
    // Euler Angles
    GLfloat Yaw;
    GLfloat Pitch;
    // Camera options
    GLfloat MovementSpeed;
    GLfloat MouseSensitivity;
    GLfloat Zoom;

    // Constructor with vectors
    Camera(glm::vec3 position = glm::vec3(0.0f, 0.0f, 0.0f), glm::vec3 up = glm::vec3(0.0f, 1.0f, 0.0f), GLfloat yaw = YAW, GLfloat pitch = PITCH) : Front(glm::vec3(0.0f, 0.0f, -1.0f)), MovementSpeed(SPEED), MouseSensitivity(SENSITIVTY), Zoom(ZOOM)
    {
        this->Position = position;
        this->WorldUp = up;
        this->Yaw = yaw;
        this->Pitch = pitch;
        this->updateCameraVectors();
    }
    // Constructor with scalar values
    Camera(GLfloat posX, GLfloat posY, GLfloat posZ, GLfloat upX, GLfloat upY, GLfloat upZ, GLfloat yaw, GLfloat pitch) : Front(glm::vec3(0.0f, 0.0f, -1.0f)), MovementSpeed(SPEED), MouseSensitivity(SENSITIVTY), Zoom(ZOOM)
    {
        this->Position = glm::vec3(posX, posY, posZ);
        this->WorldUp = glm::vec3(upX, upY, upZ);
        this->Yaw = yaw;
        this->Pitch = pitch;
        this->updateCameraVectors();
    }

    // Returns the view matrix calculated using Euler Angles and the LookAt Matrix
    glm::mat4 GetViewMatrix()
    {
        return glm::lookAt(this->Position, this->Position + this->Front, this->Up);
    }

    // Processes input received from any keyboard-like input system. Accepts input parameter in the form of camera defined ENUM (to abstract it from windowing systems)
    void ProcessKeyboard(Camera_Movement direction, GLfloat deltaTime)
    {
        GLfloat velocity = this->MovementSpeed * deltaTime;
        if (direction == FORWARD)
            this->Position += this->Front * velocity;
        if (direction == BACKWARD)
            this->Position -= this->Front * velocity;
        if (direction == LEFT)
            this->Position -= this->Right * velocity;
        if (direction == RIGHT)
            this->Position += this->Right * velocity;
    }

    // Processes input received from a mouse input system. Expects the offset value in both the x and y direction.
    void ProcessMouseMovement(GLfloat xoffset, GLfloat yoffset, GLboolean constrainPitch = true)
    {
        xoffset *= this->MouseSensitivity;
        yoffset *= this->MouseSensitivity;

        this->Yaw += xoffset;
        this->Pitch += yoffset;

        // Make sure that when pitch is out of bounds, screen doesn't get flipped
        if (constrainPitch)
        {
            if (this->Pitch > 89.0f)
                this->Pitch = 89.0f;
            if (this->Pitch < -89.0f)
                this->Pitch = -89.0f;
        }

        // Update Front, Right and Up Vectors using the updated Euler angles
        this->updateCameraVectors();
    }

    // Processes input received from a mouse scroll-wheel event. Only requires input on the vertical wheel-axis
    void ProcessMouseScroll(GLfloat yoffset)
    {
        if (this->Zoom >= 1.0f && this->Zoom <= 45.0f)
            this->Zoom -= yoffset;
        if (this->Zoom <= 1.0f)
            this->Zoom = 1.0f;
        if (this->Zoom >= 45.0f)
            this->Zoom = 45.0f;
    }

private:
    // Calculates the front vector from the Camera's (updated) Euler Angles
    void updateCameraVectors()
    {
        // Calculate the new Front vector
        glm::vec3 front;
        front.x = cos(glm::radians(this->Yaw)) * cos(glm::radians(this->Pitch));
        front.y = sin(glm::radians(this->Pitch));
        front.z = sin(glm::radians(this->Yaw)) * cos(glm::radians(this->Pitch));
        this->Front = glm::normalize(front);
        // Also re-calculate the Right and Up vector
        this->Right = glm::normalize(glm::cross(this->Front, this->WorldUp));  // Normalize the vectors, because their length gets closer to 0 the more you look up or down which results in slower movement.
        this->Up = glm::normalize(glm::cross(this->Right, this->Front));
    }
};
Paul Terwilliger
  • 3
    Nice and repeatable from the sounds of things. What sayeth [the debugger](https://en.wikipedia.org/wiki/Debugger)? – user4581301 Sep 26 '16 at 16:37
  • 1
    Please include the source code in the question itself. Otherwise this question is useless when the pastebins are deleted. – BDL Sep 26 '16 at 18:00
  • 2
    You are allocating the objects as local variables, thus on the stack, which is not made to hold that many objects. Allocate your object array on the heap (you may use a `std::vector`). – Matteo Italia Sep 26 '16 at 19:14
  • @MatteoItalia How do I do that? – Paul Terwilliger Sep 26 '16 at 19:58

2 Answers

4

First, I must point out that what you posted as your Shader class is actually the camera code. I also learned from that tutorial, though, so I simply replaced it with the right file myself.

The problem you want to solve is related to your program's stack size. With Visual Studio's default settings the stack is limited to 1 MB, and local variables live on the stack, so your program overflows it when NUM_INS is set to 160000.

True Solution (Edited)

As @Matteo Italia said, use std::vector instead, or just change your array declaration from glm::vec2 translations[NUM_INS]; to glm::vec2* translations = new glm::vec2[NUM_INS];, and don't forget to delete[] it when you no longer need it. I tested the second way and it works. Sorry for my previous bad answer; I should learn more about the heap and the stack!
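A minimal sketch of the heap-allocation route, assuming you would rather not track the delete[] by hand. The Vec2 struct is a stand-in for glm::vec2 so the snippet compiles without GLM, and allocateTranslations is a hypothetical helper name, not something from the question:

```cpp
#include <cstddef>
#include <memory>

// Stand-in for glm::vec2 (an assumption, so the sketch builds without GLM).
struct Vec2 { float x, y; };

// Allocates `count` zero-initialized elements on the heap.
// The unique_ptr calls delete[] automatically when it goes out of scope,
// which replaces the manual new[]/delete[] pair.
std::unique_ptr<Vec2[]> allocateTranslations(std::size_t count) {
    return std::unique_ptr<Vec2[]>(new Vec2[count]());
}
```

In the question's code this would be used as auto translations = allocateTranslations(NUM_INS);, and translations.get() (or &translations[0]) is the pointer to hand to glBufferData.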

For anyone who doesn't understand the background, I found ref1 and ref2 useful for learning.


Worst Solution (previous, should not be used)

To solve the problem, you could change the Visual Studio settings with the following steps:

  1. Right-click on your project -> Properties
  2. Go to Linker -> System
  3. Set the Stack Reserve Size to 2097152 (2 MB)

Note that my Visual Studio is the Chinese edition, so I don't know the exact English names of these options. With this setting, you can set NUM_INS to 160,000 or more and see the result.

Tokenyet
  • 3
    That's the *stack* size, and increasing it is almost never the solution. OP should simply allocate his stuff on the heap instead of the stack. – Matteo Italia Sep 26 '16 at 19:12
  • @Tokenyet Thank you, your comment helped me solve my problem. I understand that this is a different question, should I post it as a new one? My code with the cubes is running at ~30 fps, but I expected it to be higher. Is there anything simple I can do to speed it up? – Paul Terwilliger Sep 26 '16 at 20:01
  • @Paul Terwilliger you should make a new post showing how you implemented your FPS counter, along with the other details. For now, I just want to remind you of one thing: check your [vsync](http://stackoverflow.com/questions/11312318/my-limited-fps-60) and other frame-limit settings. If you have done that, then go ahead with the new post :) – Tokenyet Sep 26 '16 at 20:13
  • 1
    This is really not the right answer in this case; it might work, but it is a fragile solution. – RamblingMad Sep 27 '16 at 07:15
  • @Tokenyet Okay, I have made a new post regarding frames-per-second. You will find it here: http://stackoverflow.com/questions/39752685/instancing-millions-of-objects-in-opengl-improving-frames-per-second Thanks for all of your help! – Paul Terwilliger Sep 28 '16 at 16:13
  • Because of memory limitations, trying to instance, say, millions of objects is a bit beyond what most processors and video cards can handle. Currently, what the user would want to do is instance as many as they can support without crashing, then do something called `batch rendering`. – Francis Cugler Jan 01 '18 at 06:29
2

Here

glm::vec2 translations[NUM_INS];

you are allocating your array of positions on the stack; now, as long as NUM_INS is relatively small this is not such a big problem, but when you start going to "big" numbers (say, 100000) the stack just can't take it.

Given that each glm::vec2 element is made of a pair of 32-bit floats (so each vec2 is 8 bytes), 160000 elements take 1.28 MB, which overflows the stack (1 MB on Windows with the default linker settings).
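The arithmetic can be checked at compile time. This is only a sketch: Vec2 is a stand-in for glm::vec2, the helper names are made up for illustration, and the 1 MB figure is the default stack reserve mentioned above.

```cpp
#include <cstddef>

// Stand-in for glm::vec2 (an assumption, so the sketch builds without GLM):
// two 32-bit floats, 8 bytes in total.
struct Vec2 { float x, y; };

// Default stack size on Windows with the default linker settings: 1 MB.
constexpr std::size_t kDefaultStackBytes = 1024 * 1024;

// Bytes occupied by an array of `count` Vec2 elements.
constexpr std::size_t arrayBytes(std::size_t count) {
    return count * sizeof(Vec2);
}

// True if the array alone already exceeds the default stack size.
constexpr bool overflowsDefaultStack(std::size_t count) {
    return arrayBytes(count) > kDefaultStackBytes;
}

static_assert(sizeof(Vec2) == 8, "two 32-bit floats");
static_assert(!overflowsDefaultStack(90000), "720,000 bytes fits");
static_assert(overflowsDefaultStack(160000), "1,280,000 bytes does not fit");
```

This matches what the question reports: 90,000 instances (720,000 bytes) run, while 160,000 instances (1,280,000 bytes) crash. The real limit is somewhat lower than 1 MB, because the vertex arrays and every other local variable share the same stack.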

The solution to this problem is not to increase the stack size: the stack is intentionally limited in size, and is not optimized for taking big objects. Instead, you should allocate your elements on the heap, which allows you to exploit all the virtual memory that is available for your process.

To do this, either use new/delete or - more simply - learn to use the std::vector class:

std::vector<glm::vec2> translations(NUM_INS);

The rest of your code should work as is.
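For reference, here is a sketch of how the translation grid could be built with std::vector. It assumes a Vec2 stand-in for glm::vec2 and two hypothetical helpers, makeTranslations and byteSize, that are not part of the question's code; side is the square root of NUM_INS.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for glm::vec2 (an assumption, so the sketch builds without GLM).
struct Vec2 { float x, y; };

// Builds a side x side grid of offsets inside [-1, 1], mirroring the nested
// loops in the question. The storage lives on the heap, not the stack.
std::vector<Vec2> makeTranslations(int side) {
    std::vector<Vec2> translations;
    translations.reserve(static_cast<std::size_t>(side) * side);
    const float offset = 1.0f / side;
    for (int y = -side; y < side; y += 2) {
        for (int x = -side; x < side; x += 2) {
            translations.push_back({ static_cast<float>(x) / side + offset,
                                     static_cast<float>(y) / side + offset });
        }
    }
    return translations;
}

// Number of bytes to upload, i.e. the size argument for glBufferData.
std::size_t byteSize(const std::vector<Vec2>& v) {
    return v.size() * sizeof(Vec2);
}
```

With glm::vec2 in place of Vec2, the upload would then read glBufferData(GL_ARRAY_BUFFER, byteSize(translations), translations.data(), GL_STATIC_DRAW);. A side of 1000 gives the one million instances the question is aiming for.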

Matteo Italia