I have a 3D texture where I write data and use it as voxels in the fragment shader in this way:

#extension GL_ARB_shader_image_size : enable
...
layout (binding = 0, rgba8) coherent uniform image3D volumeTexture;
...
void main(){
vec4 fragmentColor = ...
vec3 coords = ...
imageStore(volumeTexture, ivec3(coords), fragmentColor);
}

The texture is defined like this:

glGenTextures(1, &volumeTexture);
glBindTexture(GL_TEXTURE_3D, volumeTexture);
glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA8, volumeDimensions, volumeDimensions, volumeDimensions, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0);

and it is bound like this when I have to use it:

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_3D, volumeTexture);

Now my issue is that I would like to have a mipmapped version of this texture without using the OpenGL function for it, because I noticed that it is extremely slow. So I was thinking of writing to all levels of the 3D texture at the same time. For instance, if the maximum resolution is 512^3, then as I write one voxel VALUE into that texture I also write 0.125*VALUE into the 256^3 level, 0.015625*VALUE into the 128^3 level, and so on. Since I am using imageStore, which uses atomicity, all values will be written, and with these weights I would automatically get the average value (not exactly like an interpolation, but I might get a pleasing result anyway). So my question is: what is the best way to have multiple 3D textures and write to all of them at the same time?
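
To make the weighting scheme concrete, here is a minimal sketch in plain C (not GLSL) of the per-level arithmetic only. The names `Voxel`, `mip_coord`, and `mip_weight` are hypothetical helpers introduced for illustration. It assumes each level halves the resolution per axis, and that the per-level contributions are summed rather than overwritten, which is what the weights need in order to produce an average.

```c
/* Hypothetical illustration type: integer voxel coordinates. */
typedef struct { int x, y, z; } Voxel;

/* Coordinate of the voxel at mip `level` that contains the level-0
   voxel `v`. Each level halves the resolution along every axis. */
Voxel mip_coord(Voxel v, int level) {
    Voxel r = { v.x >> level, v.y >> level, v.z >> level };
    return r;
}

/* Weight of one level-0 voxel inside its parent at mip `level`.
   A parent at level n covers 8^n level-0 voxels, so the weight is 1/8^n. */
double mip_weight(int level) {
    return 1.0 / (double)(1u << (3 * level));
}
```

For example, `mip_weight(1)` is 0.125 and `mip_weight(2)` is 0.015625, matching the factors above, and the level-0 voxel (511, 300, 7) maps to (127, 75, 1) at level 2.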

jozxyqk
tigeradol

1 Answer

I believe hardware mipmapping is about as fast as you'll get. I've always assumed attempting custom mipmapping would be slower given you have to bind and rasterize to each layer manually in turn. Atomics will give huge contention and it'll be amazingly slow. Even without atomics you'd be negating the nice O(log n) construction of mipmaps.

You have to be really careful with imageStore with regard to access order and cache. I'd start here and try some different indexing (e.g. row/column vs. column/row).

You could try drawing to the texture the older way, by binding it to an FBO and drawing a full screen triangle (big triangle that covers the viewport) with glDrawElementsInstanced. In the geometry shader, set gl_Layer to the instance ID. The rasterizer creates fragments for x/y and the layer gives z.

Lastly, 512^3 is simply a huge texture even by today's standards. Maybe find out your theoretical maximum GPU bandwidth to get an idea of how far away you are. For example, let's say your GPU can do 200 GB/s. You'll probably only get 100 GB/s in a good case anyway. Your 512^3 RGBA8 texture is 512 MB, so you might be able to write to it in ~5 ms (IMO this seems awfully fast; maybe I made a mistake). Expect some overhead and latency from the rest of the pipeline, spawning and executing threads, etc. If you're writing complex stuff, then memory bandwidth isn't the bottleneck and my estimation goes out the window. So try just writing zeroes first. Then try changing the coords xyz order.
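
The back-of-the-envelope estimate above can be checked with a few lines of C. This is only a sketch: `volume_bytes` and `write_time_ms` are hypothetical helper names, the texture is assumed to be RGBA8 (4 bytes per voxel), and 1 GB/s is taken as 1e9 bytes per second.

```c
#include <stdint.h>

/* Bytes in a cubic RGBA8 volume with `dim` voxels per side
   (assumes 4 bytes per voxel, level 0 only, no mip chain). */
uint64_t volume_bytes(uint64_t dim) {
    return dim * dim * dim * 4u;
}

/* Milliseconds needed to write `bytes` once at a sustained
   bandwidth of `gb_per_s` (assumes 1 GB = 1e9 bytes). */
double write_time_ms(uint64_t bytes, double gb_per_s) {
    return (double)bytes / (gb_per_s * 1e9) * 1e3;
}
```

With `dim = 512` this gives 536,870,912 bytes (512 MiB), and at an assumed 100 GB/s a single full write takes roughly 5.4 ms, so the ~5 ms figure holds as a pure-bandwidth lower bound.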


Update: Instead of using the fragment shader to create your threads, you can use the vertex shader, which in theory avoids rasterizer overhead, though I've seen cases where it doesn't perform as well. Call glEnable(GL_RASTERIZER_DISCARD), then glDrawArrays(GL_POINTS, 0, numThreads), and use gl_VertexID as your thread index.
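
As a rough sketch of how a flat thread index could be turned into voxel coordinates, here is the arithmetic in plain C rather than GLSL. `Coord` and `index_to_coord` are hypothetical names, and the x-fastest layout is just one possible choice; swapping the roles of x, y, and z is one way to experiment with access order.

```c
/* Hypothetical illustration type: integer voxel coordinates. */
typedef struct { int x, y, z; } Coord;

/* Map a flat index in [0, dim^3) to 3D coordinates in a cubic volume.
   x varies fastest, then y, then z (an assumed layout, not the only one). */
Coord index_to_coord(int index, int dim) {
    Coord c;
    c.x = index % dim;
    c.y = (index / dim) % dim;
    c.z = index / (dim * dim);
    return c;
}
```

In a shader the same integer arithmetic would be applied to gl_VertexID, with numThreads equal to dim^3 so that every voxel gets exactly one thread.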

jozxyqk
  • I am using imageStore because I am voxelising the scene using geom+frag in one pass (Cyril Crassin's method); it works perfectly and runs at 10 fps for 512^3 (I have a GT540M). The reason for using imageStore is that two triangles can write to the same voxel. – tigeradol Jun 26 '14 at 13:26
  • @tigeradol instead of drawing 2 triangles to get a perfect quad, draw a single massive one and just let it get clipped to the viewport (I'll update the answer). 10 fps doesn't seem too bad; how long does it take to mipmap? – jozxyqk Jun 26 '14 at 13:29
  • I see your point, but I don't understand how I am going to write color values into the texture if I don't use the fragment shader. Regarding mipmapping, it drops to less than 1 fps. At 256^3 it runs at 29 fps (37 if I don't do shadow mapping and light calculation) and drops to 6 with mipmapping. – tigeradol Jun 26 '14 at 13:34
  • @tigeradol You can write colour values with `imageStore`, the same as you do in the fragment shader; however, you'd use `gl_VertexID` to build the `coords` index. I don't think mipmap generation should be any slower than the time taken to compute and write to level 0. Perhaps a driver bug/limitation? You might be better off doing it manually. I have seen performance cliffs like this when running out of GPU memory, as I guess textures start being swapped out per frame (though your 1GB tex+mips should leave you 512MB free). – jozxyqk Jun 27 '14 at 02:47
  • Hi, I just tried writing to multiple textures manually and it is actually very fast. These are the results at 512^3: no mipmap: 10 fps; OpenGL mipmap: 1.9 fps; manual write: 9 fps. If I run it at 256^3 I lose only 1 frame. What I did is write without any interpolation in the lower levels, and the result is still good; maybe that is why it is faster. Next I will experiment with your suggestion of using vertex IDs. Thanks – tigeradol Jun 27 '14 at 12:42
  • Can I contact you privately in case I have any trouble, so that we don't spam the thread? – tigeradol Jun 27 '14 at 13:57
  • @tigeradol Sure: http://chat.stackoverflow.com/rooms/56509/opengl-writing-to-and-mipmapping-3d-textures – jozxyqk Jun 30 '14 at 07:05