Labels in an OpenGL map application

Question

Short Version

How can I draw short text labels in an OpenGL mapping application without having to manually recompute coordinates as the user zooms in and out?

Long Version

I have an OpenGL-based mapping application where I need to be able to draw data sets with up to about 250k points. Each point can have a short text label, usually about 4 or 5 characters long.

Currently, I do this using a single texture containing all the characters. For each point, I define a quad for each character in its label. So a point with the label "Fred" would have four quads associated with it, and each quad uses texture coordinates into that single texture to draw its corresponding character.

When I draw the map, I draw the map points themselves in map coordinates (e.g., longitude/latitude). Then I compute the position of each point in screen coordinates and update the four corner points for each of that point's label quads, again in screen coordinates. (For instance, if I determine the point is drawn at screen point 100, 150, I could set the quad for the first character in the point's label to be the rectangle starting with left-top point of 105, 155 and having a width of 6 pixels and a height of 12 pixels, as appropriate for the particular character. Then the second character might start at 120, 155, and so on.) Then once all these label character quads are positioned correctly, I draw them using an orthogonal screen projection.

The problem is that the process of updating all of those character quad coordinates is slow, taking about half a second for a particular test data set with 150k points. Since each label is about four characters long, there are about 150k * [4 characters per point] * [4 coordinate pairs per character], or roughly 2.4 million, coordinate pairs that need to be set on each update.

If the map application didn't involve zooming, I would not need to recompute all these coordinates on each refresh. I could just compute the label coordinates once and then simply shift my viewing rectangle to show the right area. But with zooming, I can't see how to make it work without doing coordinate computation, because otherwise the characters will grow huge as you zoom in and tiny as you zoom out.

What I want (and what I understand OpenGL doesn't provide) is a way to tell OpenGL that a quad should be drawn in a fixed screen-coordinate rectangle, but that the top-left position of that rectangle should be a fixed distance from a given point in map coordinate space. So I want both a primitive hierarchy (a given map point is the parent of its label character quads) and the ability to mix two different coordinate systems within this hierarchy.

I'm trying to understand whether there is some magic transformation matrix I can set that will do all this for me, but I can't see how to do it.

The other alternative I've considered is using a shader on each point to handle computing the label character quad coordinates for that point. I haven't worked with shaders before, and I'm just trying to understand (a) if it's possible to use shaders to do this, and (b) whether computing all those points in shader code actually buys me anything over computing them myself. (By the way, I have confirmed that the big bottleneck is computing the quad coordinates, not in uploading the updated coordinates to the GPU. The latter takes a bit of time, but it's the computation, the sheer number of coordinates being updated, that takes up the bulk of that half second.)

(Of course, the other other alternative is to be smarter about which labels need to be drawn in a given view in the first place. But for now I'd like to concentrate on the solution assuming all labels need to be drawn.)

possible duplicate of [technique and speed expectations for opengl text labeling](http://stackoverflow.com/questions/5905055/technique-and-speed-expectations-for-opengl-text-labeling) — datenwolf, Jun 16 '11 at 06:15

mgiuca · Answer 1 · 2011-06-16T08:40:42.203

So the basic problem ("because otherwise the characters will grow huge as you zoom in and tiny as you zoom out") is that you are doing calculations in map coordinates rather than screen coordinates? And if you did it in screen coords, this would require more computations? Obviously, any rendering needs to translate from map coordinates to screen coordinates. The problem seems to be that you are translating from map to screen too late. Therefore, rather than doing a single map-to-screen for each point, and then working in screen coords, you are working mostly in map coords, and then translating per-character to screen coords at the very end. And the slow part is that you are working in screen coords, then having to manually translate back to map coords just to tell OpenGL the map coords, and it will convert those back to screen coords! Is that a fair assessment of your problem?

The solution therefore is to push that transformation earlier in your pipeline. However, I can see why it is tricky, because at first glance, OpenGL seems to want to do everything in "world coordinates" (for you, map coords), but not in screen coords.

Firstly, I am wondering why you are doing separate coordinate calculations for each character. What font rendering system are you using? Something like FreeType will automatically generate a bitmap image of an entire string, and doesn't require you to work per-character [edit: this isn't quite true; see comments]. You definitely shouldn't need to calculate the map coordinate (or even screen coordinate) for every character. Calculate the screen coordinate for the top-left corner of the label, and have your font rendering system produce the bitmap of the entire label in one go. That should speed things up about fourfold (since you assume 4 characters per label).

Now as for working in screen coords, it may be helpful to learn a bit about shaders. The more you learn about OpenGL, the more you learn that really it isn't a 3D rendering engine at all. It's just a 2D graphics library with some very fast matrix primitives built-in. OpenGL actually works, at the lowest level, in screen coordinates (not pixel coordinates -- it works in normalized screen space, I think from memory from -1 to 1 in both the X and Y axis). The only reason it "feels" like you're working in world coordinates is because of these matrices you have set up.

So I think the reason why you are working in map coords all the way until the end is because it's easiest: OpenGL naturally does the map-to-screen transform for you (using the matrices). You have to change that, because you want to work in screen coords yourself, and therefore you need to make the transformation a long time before OpenGL gets its hands on your data. So when you go to draw a label, you should manually apply the map-to-screen transformation matrix on each point, as follows:

1. You have a particular point (which needs a label drawn) in map coords.
2. Apply the map-to-screen matrix to convert the point to screen coords. This probably means multiplying the point by the MODELVIEW and PROJECTION matrices, using the same algorithm that OpenGL does when it's rendering a vertex. So you could either glGet the GL_MODELVIEW_MATRIX and GL_PROJECTION_MATRIX to extract OpenGL's current matrices, or you could manually keep around a copy of the matrix yourself.
3. Now that you have the map point in screen coords, compute the position of the label's text. This is simply adding 5 pixels in the X and Y axes, as you said above. However, remember that you aren't in pixel space but in normalised screen space, which runs from -1 to 1, so an offset is a fraction of the screen (adding 0.05 units shifts by 2.5% of the screen width, for example). It's probably better not to think in pixels, because then your application will scale to match the resolution. But if you really want to think in pixels, you will have to calculate the pixels-to-units factor based on the resolution.
4. Use glPushMatrix to save the current matrix, then glLoadIdentity to set the current matrix to the identity, which tells OpenGL not to transform your vertices. (I think you will have to do this for both the PROJECTION and MODELVIEW matrices.)
5. Draw your label, in screen coordinates, then use glPopMatrix to restore the saved matrices.
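Steps 2 and 3 can be sketched as plain arithmetic. This is only an illustration, written in Python for brevity, not the asker's code: `transform`, `ortho_2d`, `map_to_ndc`, and `pixels_to_ndc` are hypothetical helper names, and the matrix is assumed to be a combined projection-times-modelview matrix in the column-major layout that glGet returns. The view bounds and the 800x600 viewport are made-up example values.

```python
def transform(m, v):
    """Multiply a column-major 4x4 matrix (flat list of 16 floats) by a 4-vector."""
    return [sum(m[col * 4 + row] * v[col] for col in range(4)) for row in range(4)]


def ortho_2d(left, right, bottom, top):
    """Matrix equivalent to glOrtho(left, right, bottom, top, -1, 1), column-major."""
    m = [0.0] * 16
    m[0] = 2.0 / (right - left)
    m[5] = 2.0 / (top - bottom)
    m[10] = -1.0
    m[12] = -(right + left) / (right - left)
    m[13] = -(top + bottom) / (top - bottom)
    m[15] = 1.0
    return m


def map_to_ndc(proj_modelview, lon, lat):
    """Step 2: map coords -> normalised device coords in [-1, 1]."""
    x, y, _z, w = transform(proj_modelview, [lon, lat, 0.0, 1.0])
    return (x / w, y / w)


def pixels_to_ndc(pixels, viewport_size_px):
    """Step 3: the viewport spans 2 NDC units, so 1 pixel = 2 / size units."""
    return 2.0 * pixels / viewport_size_px


# Example (assumed values): a view covering lon -10..10, lat -5..5, 800x600 pixels.
matrix = ortho_2d(-10.0, 10.0, -5.0, 5.0)
anchor_x, anchor_y = map_to_ndc(matrix, 5.0, 2.5)
label_x = anchor_x + pixels_to_ndc(5, 800)  # 5 px to the right of the point
label_y = anchor_y + pixels_to_ndc(5, 600)  # 5 px above the point
```

Note that the pixel-to-unit factor differs per axis whenever the viewport is not square, which is why the width and height are passed separately.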

So you don't really need to write a shader. You could certainly do this in a shader, and it would certainly make step 2 faster (no need to write your own software matrix multiply code; multiplying matrices on the GPU is extremely fast). But that would be a later optimisation, and a lot of work. I think the above steps will help you work in screen coordinates and avoid having to waste a lot of time just to give OpenGL map coordinates.
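For reference, here is a sketch of the arithmetic a vertex shader could perform if you did go that way. It is written in Python only to show the math; a real shader would be GLSL. The assumed layout (not something confirmed in this thread) is that every label-quad vertex stores two constants: the map position of its point and its own offset in pixels. Only the matrix then changes when the user zooms. All function names and the `zoom_matrix` stand-in are hypothetical.

```python
def transform(m, v):
    """Multiply a column-major 4x4 matrix (flat list of 16 floats) by a 4-vector."""
    return [sum(m[col * 4 + row] * v[col] for col in range(4)) for row in range(4)]


def zoom_matrix(zoom):
    """Stand-in for the map-to-clip matrix: a uniform 2D scale about the origin."""
    m = [0.0] * 16
    m[0] = m[5] = zoom
    m[10] = m[15] = 1.0
    return m


def label_vertex_ndc(matrix, anchor, offset_px, viewport_px):
    """Per-vertex math for one label-quad corner.

    anchor:    (lon, lat) of the map point, constant per vertex
    offset_px: (dx, dy) corner offset in pixels, constant per vertex
    """
    x, y, _z, w = transform(matrix, [anchor[0], anchor[1], 0.0, 1.0])
    # Pixels -> NDC: the viewport spans 2 NDC units. Multiplying by w keeps
    # the offset fixed in pixels after the divide by w below.
    x += 2.0 * offset_px[0] / viewport_px[0] * w
    y += 2.0 * offset_px[1] / viewport_px[1] * w
    return (x / w, y / w)


def ndc_to_pixels(ndc, viewport_px):
    """Convert NDC in [-1, 1] to pixel coordinates, origin at bottom-left."""
    return ((ndc[0] + 1.0) * 0.5 * viewport_px[0],
            (ndc[1] + 1.0) * 0.5 * viewport_px[1])


viewport = (800, 600)
anchor = (0.1, 0.05)
for zoom in (1.0, 4.0):
    m = zoom_matrix(zoom)
    point_px = ndc_to_pixels(label_vertex_ndc(m, anchor, (0, 0), viewport), viewport)
    corner_px = ndc_to_pixels(label_vertex_ndc(m, anchor, (5, 5), viewport), viewport)
    # The corner stays about 5 pixels from the point at every zoom level.
    print(zoom, corner_px[0] - point_px[0], corner_px[1] - point_px[1])
```

The point itself moves on screen as the zoom changes, but the corner keeps the same pixel distance from it, so the per-vertex data never has to be rewritten on the CPU.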

It's been a while since I used FreeType2, but I don't recall the API that allows you to convert an entire string (in what encoding?) into a single bitmap. FreeType2 is mostly for getting characteristics and renderings of individual glyphs; layout, which is what string rendering is, should be done by the user. The rest of your points are valid; you can generate the "mesh" for a text string relative to a point, and you won't need to compute each character individually; you just need to find the point in window-space to hook this label to. — Nicol Bolas, Jun 16 '11 at 06:31
@Nicol Bolas My mistake: I'm used to higher-level APIs like SDL_TTF. Indeed, FreeType only lets you do one glyph at a time. But you can still render it into a bitmap yourself before letting it have anything to do with the OpenGL coordinate space. — mgiuca, Jun 16 '11 at 08:40
@mgiuca, thanks for your detailed response. Unfortunately, I don't think you addressed my issue. I'm already doing what you said, handling all points related to the characters in screen coordinates. For each map point, I just do one world-to-screen calc to get the screen coords of the point. But then (and this is the part that's taking so long) I still need to update each of the screen coords for each character quad of the point's label. It's not that I'm doing a world-to-screen calc for each char quad point. But just adding the little pixel (+5, +12, etc.) offsets is what's taking so long. — M Katz, Jun 16 '11 at 18:05
@Nicol Bolas, when you say "you just need to find the point in window-space to hook this label to", do you mean anything special by "hook"? As in my comment to mgiuca, for me "hooking" means I only have to do the world-to-screen calculation once for the map point, and then the character quad coordinates are just little screen offsets from that. But still I need to update all those little character quad coordinates manually, right? There is no way to "hook" them automagically, right? (i.e., to have GL update them automatically because they are a preset offset from that one computed screen point) — M Katz, Jun 16 '11 at 18:10
Those window-space offsets are *fixed*. They do not change from one rendering to another. You generate them when you first generate the string, but unless the content of the label changes, you don't need to touch them anymore. You just have a list of quads. Once you've set your transformation matrix so that you are working in window-space, just set an offset to the matrix to position yourself to the label's location and render the quads. — Nicol Bolas, Jun 16 '11 at 21:32
@Nicol Bolas, right, well they are fixed as far as map panning goes: when the map pans I need only update a single matrix and they all shift to the right place. But for the general case of zooming, they are not fixed, right? — M Katz, Jun 18 '11 at 01:10
@M Katz Do you intend for zoom to make the text get bigger? If not, then no. And if you do, then you can just apply a scaling matrix after the translation matrix. After all, you're not actually regenerating the glyphs (are you?); you're just making the quads bigger. Scale matrices can do that just fine. — Nicol Bolas, Jun 18 '11 at 01:45
I think @Nicol Bolas is right. So my question is ARE you rendering the label to a texture? If not, maybe that is the solution. You would build the texture once when the label becomes visible. Then per-frame you are only worrying about where to draw the texture on-screen; nothing to do with fonts or characters at all. — mgiuca, Jun 20 '11 at 01:36

score 1 · Accepted Answer · answered Aug 01 '11 at 19:21

Just to follow up on the resolution:

I didn't really solve this problem, but I ended up being smarter about when I draw labels in the first place. I was able to quickly determine whether I was about to draw too many characters (i.e., so many characters that on a typical screen with a typical density of points the labels would be too close together to read in a useful way) and then I simply don't label at all. With drawing up to about 5000 characters at a time there isn't a noticeable slowdown recomputing the character coordinates as described above.
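A rough sketch of that kind of check is below. The answer does not say how the count was actually done, so the approach here (walk the points, count label characters inside the view rectangle, bail out as soon as the budget is exceeded) is an assumption, and `should_draw_labels` is a hypothetical name. The 5000-character budget is the figure mentioned above.

```python
def should_draw_labels(points, view, max_chars=5000):
    """Return False as soon as the visible labels exceed the character budget.

    points: iterable of (lon, lat, label) tuples
    view:   (min_lon, min_lat, max_lon, max_lat) of the visible map area
    """
    min_lon, min_lat, max_lon, max_lat = view
    total_chars = 0
    for lon, lat, label in points:
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            total_chars += len(label)
            if total_chars > max_chars:
                return False  # early exit: no need to count the rest
    return True


# Example: 2000 points in a row, each labelled "Fred" (4 characters).
points = [(i, 0, "Fred") for i in range(2000)]
zoomed_in = should_draw_labels(points, (0, -1, 999, 1))    # 1000 points visible
zoomed_out = should_draw_labels(points, (0, -1, 1999, 1))  # all 2000 visible
```

The early exit means the check costs at most a few thousand comparisons in the case where labels get skipped, so it stays cheap even for the largest data sets.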

score 0 · Answer 3 · answered Jun 16 '11 at 18:26

Side comment on:

""" generate a bitmap image of an entire string, and doesn't require you to work per-character ... Calculate the screen coordinate for the top-left corner of the label, and have your font rendering system produce the bitmap of the entire label in one go. That should speed things up about fourfold (since you assume 4 characters per label). """

FreeType or no, you could certainly compute a bitmap image for each label, rather than each character, but that would require one of:

- storing thousands of different textures, one for each label. It seems like a bad idea to store that many textures, but maybe it's not.
- rendering each label, for each point, at each screen update. This would certainly be too slow.
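To put a rough number on the first option, here is a back-of-the-envelope estimate using figures from the question (150k points, about 4 characters per label, a 6x12 pixel glyph). It assumes the bitmaps are packed tightly with no padding or power-of-two rounding, which real textures may not allow, so treat it as a lower bound. `label_texture_bytes` is a hypothetical helper.

```python
def label_texture_bytes(num_labels, chars_per_label, glyph_w_px, glyph_h_px, bytes_per_pixel):
    """Total bytes needed to store one pre-rendered bitmap per label."""
    pixels_per_label = chars_per_label * glyph_w_px * glyph_h_px
    return num_labels * pixels_per_label * bytes_per_pixel


alpha_only = label_texture_bytes(150_000, 4, 6, 12, 1)  # 43,200,000 bytes
rgba = label_texture_bytes(150_000, 4, 6, 12, 4)        # 172,800,000 bytes
```

So under these assumptions the storage is tens of megabytes for a single-channel format and four times that for RGBA.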
