Data structure to query points which lie inside a triangle

Question

I have some 2D data which contains edges which were rasterized into pixels. I want to implement an efficient data structure which returns all edge pixels which lie in a non-axis-aligned 2D triangle.

Spatial query for sparse data

The image shows a visualization of the problem where white denotes the rasterized edges, and red visualizes the query triangle. The result would be all white pixels which lie on the boundary or inside the red triangle.

Looking at the image more closely, one notices that the boolean data is sparse: if black pixels are denoted by 0 and white pixels by 1, the number of 1s is much lower than the number of 0s. Therefore, rasterizing the red triangle and checking whether each point inside it is white or black is not the most efficient approach.
Besides being sparse, the white pixels originate from edges, so they are naturally connected to each other. At junctions with other lines, however, they have more than two neighbors. Pixels at a junction should only be returned once.
The data must be processed in realtime, but with no GPU assistance. There will be multiple queries for different triangle contents, and after each one, points may be removed from the data structure. However, new points won't be inserted anymore after the initial filling of the data structure.
The query triangles are already known when the rasterized edges arrive.
There are more query triangles than data edges.

There are many spatial data structures available. However, I'm wondering which one is best for my problem. I'm willing to implement a highly optimized data structure to solve this problem, as it will be a core element of the project. Therefore, mixes or variations of data structures are also welcome!

R-trees seem to be the best data structure I have found for this problem so far, as they support rectangle-based queries. I would query all white pixels within the AABB of the query triangle, then check for each returned pixel whether it lies within the query triangle.

However, I'm not sure how well R-trees will behave, since edge-based data is not easily grouped into rectangles: the points are clumped together on narrow lines and not spread out.

I'm also not sure whether it would make sense to pre-build the structure of the R-tree using information about the query triangles that will be issued as soon as the structure is filled (as mentioned before, the query triangles are already known when the data arrives).
Reversing the problem also seems to be a valid solution: I would use a 2-dimensional interval tree to get, for each white pixel, a list of all triangles which contain it. The pixel can then be stored in all those result sets up front and be returned instantly when the query arrives. However, I'm not sure how this performs, as the number of triangles is higher than the number of edges but still lower than the number of white pixels (an edge is typically split into ~20-50 pixels).
A data structure which exploits the fact that white pixels most often have white pixels as neighbors would seem to be the most efficient. However, I could not find anything like that so far.

score 1 · Answer 1 · edited May 23 '17 at 11:48

Decompose the n query triangles into n*3 lines. For every point under test you can determine on which side of each line it lies. The rest is boolean logic.

EDIT: since your points are rasterised, you could precompute the points on the scanlines where the scanline enters or leaves a particular query triangle (=crosses one of the 3n lines above && is on the "inside" of the other two lines that participate in that particular triangle)

UPDATE: Triggered by another topic ( How can I find out if point is within a triangle in 3D? ) I'll add code to show that a non-convex case can also be expressed in terms of "which side of every line a point is on". Since I am lazy, I'll use an L-shaped form. IMHO other non-convex shapes can be processed similarly. The lines are parallel to the X- and Y-axes, but that again is laziness.

/*

Y
| +-+
| | |
| | +-+
| |   |
| +---+
|
0------ X
the line pieces:
Horizontal:
(x0,y0) - (x2,y0)
(x1,y1) - (x2,y1)
(x0,y2) - (x1,y2)
Vertical:
(x0,y0) - (x0,y2)
(x1,y1) - (x1,y2)
(x2,y0) - (x2,y1)

The lines:
(x==x0)
(x==x1)
(x==x2)
(y==y0)
(y==y1)
(y==y2)

Combine them:
**/

#define x0 2
#define x1 4
#define x2 6

#define y0 2
#define y1 4
#define y2 6

#include <stdio.h>

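/* Returns 1 if (x,y) lies inside the L-shape, 0 otherwise.
 * Each comparison against one of the six lines contributes one bit;
 * the switch lists the bit patterns of the three cells that form the L. */
int inside(int x, int y)
{
    switch ( (x < x0 ? 0 : 1)
           + (x < x1 ? 0 : 2)
           + (x < x2 ? 0 : 4)
           + (y < y0 ? 0 : 8)
           + (y < y1 ? 0 : 16)
           + (y < y2 ? 0 : 32) ) {
    case 1+8:       /* x0 <= x < x1, y0 <= y < y1 */
    case 1+2+8:     /* x1 <= x < x2, y0 <= y < y1 */
    case 1+8+16:    /* x0 <= x < x1, y1 <= y < y2 */
        return 1;
    default:
        return 0;
    }
}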

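int main(void)
{
    int xx, yy, res;

    /* Read "x y" pairs from stdin and report whether each lies inside. */
    while (1) {
        res = scanf("%d %d", &xx, &yy);
        if (res < 2) break;   /* stop on EOF or unparsable input */
        res = inside(xx, yy);
        printf("(%d,%d) := %d\n", xx, yy, res);
    }
    return 0;
}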

answered Nov 30 '11 at 15:44 by wildplasser · edited May 23 '17 at 11:48 by Community

A scanline approach won't be fast. Imagine an approach with a horizontal scanline in the example above. Most white pixels won't be inside the triangle. Runtime complexity would be O(#white pixels) / query. – Etan Nov 30 '11 at 16:35
I don't suggest a scanline approach. Since you refer to them as "pixels", I am assuming that your Y-values are quantised enough to *treat* them as scanlines. How many *distinct* Y-values are there? Are they integers? – wildplasser Nov 30 '11 at 17:05
Like in the picture in the post, the pixels can be visualized as an image. The whole image will be around VGA size, so 640x480. So the edge pixels will have at most 480 distinct Y values. Could you elaborate on your answer? I don't think I understand your idea yet. Maybe an example would help to illustrate it. – Etan Nov 30 '11 at 17:58
Well: for each of the 480 possible Y-coordinates you can construct a decision tree (or other structure) which helps you to detect whether a given X-coordinate **on that particular scanline** is inside one of your query triangles. This state can change every time the scan crosses one of the lines. – wildplasser Nov 30 '11 at 18:09
The problem is the realtime requirement. I may only use about 1ms to build the structure and answer all queries. Also, there are multiple queries, so I would have to build a complete decision tree for each of the query triangles – Etan Nov 30 '11 at 20:46
"I would have to build a complete decision tree for each of the query triangles" Wrong: you can build one decision tree for the combined set of queries (the first paragraph in my reply). Extend all the (3*N) lines of the N queries towards infinity. Now every point on the plane has one of 2**(3*N) possible statuses, depending on whether it is left or right of each of the 3*N lines. These "statuses" can be checked (bitwise: every triangle uses three bits) to test whether a point is inside or outside a particular triangle. How many query triangles do you expect? How many {x,y} white pixels? – wildplasser Nov 30 '11 at 21:03
I expect about 100 triangles and about 200 edges which are rasterized into the image. when we take about 25 pixels / edge this will lead to 5k white pixels, 100 test triangles and 300k black pixels. – Etan Nov 30 '11 at 21:24
Jep. About 100 query triangles is correct. They may also be partly overlapping. – Etan Nov 30 '11 at 22:22

score 0 · Answer 2 · answered Nov 30 '11 at 23:56

There are a couple of computational-geometry algorithms that I think would give good results in tandem.

1. Compute a planar subdivision that contains all of the triangle edges. (This is a little more complicated than computing all intersections of triangle edges.) For each face, make a list of the triangles that contain that face. This is admittedly worst-case cubic, but that's only when the triangles overlap a lot (and I can't help but think that there's a way to compress it to quadratic).
2. Locate each pixel in the subdivision (i.e., figure out which face it belongs to). The first pixel of each edge will cost O(log n), but if you have locality thereafter, there may be a way to shortcut the computation to something like O(1) on average. (For example, if you use the trapezoid method and store the list of trapezoids that contained the last point, you can traverse up the list until you find a trapezoid that contains the current point and work back down. Compare giving hints to C++ STL set insertion by passing an iterator near the insertion point.)
