17

I have many particles whose vertices change every frame. The vertices are currently being drawn using a vertex array in 'client' memory. What performance characteristics can I expect if I use a vertex buffer object?

Since I have to make a number of glBufferSubData calls to update the particle vertices, I am transferring the vertices to video memory every frame anyway, right (just as I would with a regular vertex array)? Is there any benefit to VBOs in this case?

This is for iOS devices. The actual draw call: glDrawElements(GL_POINTS,num_particles,GL_UNSIGNED_SHORT,pindices);

Should I use GL_STREAM_DRAW or GL_DYNAMIC_DRAW?

bobobobo
Ari Ronen

2 Answers

5

Apple's documentation appears to recommend VBOs in all situations. If you're using ES 2.x then the GL_STREAM_DRAW vertex buffer type is explicitly for "when your application needs to create transient geometry that is rendered a small number of times and then discarded. This is most useful when your application must dynamically change vertex data every frame in a way that cannot be performed in a vertex shader." Use of glBufferSubData is then directly advocated.

Logically, I guess the only difference between supplying the data completely afresh and sending it to an existing GL_STREAM_DRAW or GL_DYNAMIC_DRAW buffer is that your space in the memory map (GPU or CPU, depending on the chip — MBXs don't really do VBOs but Apple supports them for other performance reasons) can be allocated once rather than allocated and released every frame.
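A minimal sketch of the "allocate once, overwrite every frame" pattern described above, under stated assumptions: `ParticlePos`, `MAX_PARTICLES`, `particle_bytes`, `particles_init`, `particles_draw` and the `USE_GLES` build flag are hypothetical names, not taken from the question. It uses GL_DYNAMIC_DRAW because that hint is available in ES 1.1, and it draws with glDrawArrays instead of the question's glDrawElements to keep the example short. The GL part only compiles against the iOS SDK headers.

```c
#include <stddef.h>

/* Hypothetical capacity and vertex layout, chosen for illustration. */
#define MAX_PARTICLES 1024
typedef struct { float x, y; } ParticlePos;

/* Bytes covered by the first `count` particles, clamped to the capacity. */
size_t particle_bytes(size_t count) {
    if (count > MAX_PARTICLES) count = MAX_PARTICLES;
    return count * sizeof(ParticlePos);
}

#ifdef USE_GLES  /* hypothetical flag: define when building against the iOS SDK */
#include <OpenGLES/ES1/gl.h>

static GLuint particle_vbo;

/* Called once: reserve storage for the worst case without uploading data. */
void particles_init(void) {
    glGenBuffers(1, &particle_vbo);
    glBindBuffer(GL_ARRAY_BUFFER, particle_vbo);
    glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)particle_bytes(MAX_PARTICLES),
                 NULL, GL_DYNAMIC_DRAW);
}

/* Called every frame: overwrite the front of the existing storage, then draw. */
void particles_draw(const ParticlePos *live, size_t count) {
    if (count > MAX_PARTICLES) count = MAX_PARTICLES;
    glBindBuffer(GL_ARRAY_BUFFER, particle_vbo);
    glBufferSubData(GL_ARRAY_BUFFER, 0, (GLsizeiptr)particle_bytes(count), live);
    glEnableClientState(GL_VERTEX_ARRAY);
    /* With a buffer bound, the last argument is a byte offset into it. */
    glVertexPointer(2, GL_FLOAT, sizeof(ParticlePos), (const GLvoid *)0);
    glDrawArrays(GL_POINTS, 0, (GLsizei)count);
}
#endif
```

Whether this beats a plain client-side vertex array on a given device is exactly what would need measuring; the sketch only shows that the storage is reserved once and that a single upload per frame is enough.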

Using the alignment and packing tips given in that document is likely to give a better improvement than a switch to VBOs, since otherwise the CPU just has to unpack and repack data upon glDrawElements. Though quite probably you're already aware of that and I appreciate that it isn't directly part of the question — I mainly throw it in as a comparative guess about performance benefits.
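To make the alignment point concrete, here is a small sketch of a vertex layout in which every attribute starts on a 4-byte boundary and the stride is a multiple of 4. The struct and its field names are hypothetical, and the 4-byte figure is an assumption that should be checked against the Apple document for the attribute types you actually use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical particle vertex; field names are illustrative only. */
typedef struct {
    float   x, y;        /* position: byte offset 0, 8 bytes */
    uint8_t r, g, b, a;  /* color:    byte offset 8, 4 bytes */
    float   size;        /* size:     byte offset 12, 4 bytes */
} ParticleVertex;

/* Returns 1 if an attribute offset or stride falls on a 4-byte boundary. */
int is_aligned4(size_t n) {
    return (n % 4) == 0;
}
```

Storing the color as three bytes instead of four would push `size` off a 4-byte boundary in a tightly packed buffer, which is the kind of layout the alignment advice is meant to avoid.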

Tommy
  • What the CPU has to do depends on the graphics chip and the driver. Alignment has nothing to do with performance. The data type does, and it is the driver that does the repacking. The point is to use types that match the internal data types of the graphics chip, so that the driver has nothing to do. – BЈовић Nov 03 '10 at 22:10
  • On the contrary, Apple's documentation, as linked to in my answer, explicitly states "Unaligned data requires significantly more processing, particularly when your application uses vertex buffers." See the very final sentence. Assuming you're an ADC member, see also e.g. https://devforums.apple.com/message/320610#320610, which is an Apple employee again explaining the speed benefits of alignment. Memory controllers often read unaligned values by reading two neighbouring values and combining the results — hence one read costs two memory fetches rather than one. – Tommy Nov 03 '10 at 22:36
  • Ok, I have no idea about iphone and I wasn't aware the question is for that. My answer is general for opengl (for PCs). – BЈовић Nov 03 '10 at 23:15
  • Oh, yes, completely agree. Alignment is a factor here only because Apple have been nice enough to document driver and hardware specifics above and beyond the GL spec. Hope I didn't sound rude. – Tommy Nov 03 '10 at 23:41
  • Unfortunately I'm using ES 1.1, so I can't use STREAM_DRAW. I'm still not convinced that the VBO is faster, since I would have to do one sub-buffer update for each particle index (maybe fewer if I get creative with my arrays), or one big sub-buffer update for everything, whereas with no VBO it's only going to allocate/copy for 'active particles' (based on the number of indices I send). Granted, this is a very specific optimization case and I will do some testing if it becomes a bottleneck. Thanks for finding the official word from Apple though. – Ari Ronen Nov 04 '10 at 19:34
  • Given that you plan to test and without being able to discuss such beta software as may be available to members of the developer programme under NDA, I think it might be relevant to repeat what Apple have said publicly (eg, http://developer.apple.com/technologies/tools/whats-new.html) re: what's in the new version of Instruments for Xcode 4 — "[n]ew data collection instruments are also available, including OpenGL ES for tracking iPhone graphics performance". – Tommy Nov 04 '10 at 20:31
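On the "one sub-buffer per particle versus one big sub-buffer" concern raised in the comments: one possible middle ground, sketched here as an assumption and not something either poster proposed, is to pack the active particles into a contiguous staging array on the CPU so that a single glBufferSubData call covering only the live range is needed. `ParticlePos` and `compact_live` are hypothetical names.

```c
#include <stddef.h>

typedef struct { float x, y; } ParticlePos;   /* hypothetical layout */

/* Copies the positions of particles with alive[i] != 0 to the front of
   `out`, preserving order, and returns how many were copied.
   `out` must have room for n entries. */
size_t compact_live(const ParticlePos *all, const unsigned char *alive,
                    size_t n, ParticlePos *out) {
    size_t live = 0;
    for (size_t i = 0; i < n; ++i) {
        if (alive[i]) {
            out[live++] = all[i];
        }
    }
    return live;
}
```

The returned count times `sizeof(ParticlePos)` is then the byte size to upload, and the same count can be used for the draw call. The copy itself has a CPU cost, so this too is worth measuring before adopting.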
2

By setting up VBOs properly, you are using the optimal way of transferring data to the GPU, and you might skip some driver processing. The only way to see how much improvement you get is to measure; it differs from card to card.

For a VBO how-to, see this: VBO tutorial

EDIT: Forgot to answer the question: yes, it is a good idea. But measure first.

ino
BЈовић
  • I don't think it's true that VBOs are always optimal; the overhead caused by numerous BufferSubData calls, or even worse mapBuffer (because it has to copy the buffer back to the client), could be worse than the overhead from vertex arrays. It's true that it really deserves thorough testing per video card to know for sure. I'm just seeing if I can draw on the knowledge of the crowd, since it's not worth the testing effort in my case at this point. – Ari Ronen Nov 03 '10 at 20:51