11

I have a set of latitude and longitude for various locations and also know the latitude and longitude of my current location. I have to find out the nearest places from current location.

  • Which algorithm is best one from Kdtree and quadtree to find out the neighbour locations from the set of latitudes and longitudes?
  • What are the advantage of one over other?
  • How can we implement these to algorithms in C# for above purpose?
halfer
  • 19,824
  • 17
  • 99
  • 186
Kesiya Abraham
  • 753
  • 3
  • 10
  • 24

3 Answers3

29

Comparing spatial indexing techniques I would like to bring a 3rd one into our comparative study which is called Grid indexing. and in order to understand Quad-Tree I would like to go into Grid indexing first.

What is Grid indexing?

Grid indexing is a grid-base spatial indexing method in which the study area is divided into fixed size tiles (fixed dimensions) like chess board.

Using Grid index, every point in a tile are tagged with that tile number, so the Index table could provide you a tag for each point showing the tile which our number falls in.

RTree Tiles

Imagine a situation in which you need to find points in a given rectangle. This query is performed in two steps :

  1. Find the tiles that the rectangle is overlapping, and the points in tiles (first filter)
  2. Finding the candidate points in above step that actually lay in our rectangle. this need to be done accurately using points and rectangle coordinates.(second filter)

The first filter creates a set of candidates and prevents to test all points in our study area to be checked one after the other.

The second filter is the accurate check and uses rectangle coordinates to test the candidates.

R Tree, Rectangle Query

Now, Take a look at the tiles in above pictures, what happens if the tiles are very big or very Small?

When the tiles are too big, for example assume that you have a tile with equal size of your study area, which makes only one tile! so the first filter is practically useless, the whole processing load will be burden by the second filter. In this case the first filter is fast and the second filter is very slow.

Now imagine that the tiles are very small, in this case the first filter is very slow and practically it generates the answer it self, and the second filter is fast.

Determining the tile size is very important and affects the performance directly but what if you can not determine the best tile dimension? what if you you area has both spare and dense sub-areas? Here it is the time to use other spatial indexing mechanisms like R-Tree, KD-Tree or Quad-Tree!

What is Quad-Tree?

Quad Tree method starts with a big tile that covers whole study area,and divides it by two horizontal and vertical lines to have four equal area which are new tiles and then inspect each tile to see if it has more than a pre-defined threshold, points in it. in this case the tile will be divided into four equal parts using a horizontal and a vertical dividing lines,again. The process continues till there would be no more tile with number of points bigger than threshold, which is a recursive algorithm.

So in denser areas we have smaller tiles and big tiles when having spare points.

Quad-Tree

What is KD-Tree? In KD-Tree, we divide an area if it has more than a threshold points in it (other criterias can be used) dividing is done using a (K-1) dimension geometry, for example in a 3D-Tree we need a plane to divide the space, and in a 2D-Tree we need a line to divide the area. Dividing geometry is iterative and cyclic, for example in a 3D-Tree, the first splitting plane is a X axis aligned plane and the next dividing plane is Y axis aligned and the next is Z, the cycle continue for each space parts to become acceptable(satisfy the criteria)

The following picture shows a balanced KD-Tree that each dividing line is a median, that divides an area into two sub-area with approximately equal number of points.

2D-Tree

Conclusion : If you have a well-distributed points which is not the case when talking about structural features of earth in a map, cause they are random, but is acceptable when we plan to store a city roads network. I would go for a Grid indexing.

If you have a limited resource(i.e. Car navigation Systems) you need to implement KD-Tree or Quad-Tree. Each has its own pros and cons.

  1. Quad-Tree creates a lot of empty sub-tiles because each tile has to be divided into four parts even if the whole data of our tile can be fit in one quarter, so the rest of sub-tiles are considered redundant.(take a look at the Quad-Tree picture at above)
  2. Quad-Tree has much more easier index and can be implemented more easier. Accessing a tile with a Tile ID does not need a recursive function.
  3. In a 2 dimensional Kd-Tree, each node has just two children nodes or has no child at all, so searching through the KD-Tree is a binary search in nature.
  4. Updating Quad-Tree is much more easier than updating a balanced KD-Tree.

With the above descriptions I recommend to start with Quad-Tree

Here it is a sample code for quad-tree that intend to create 5000 random point.

#include<stdio.h>
#include<stdlib.h>
//Removed windows-specific header and functions

//-------------------------------------
// STRUCTURES
//-------------------------------------
struct Point
{
    int x;
    int y;
};


struct Node
{
    int posX;
    int posY;
    int width;
    int height;
    Node *child[4];         //Changed to Node *child[4] rather than Node ** child[4]
    Point pointArray[5000];
};
//-------------------------------------
// DEFINITIONS
//-------------------------------------

void BuildQuadTree(Node *n);
void PrintQuadTree(Node *n, int depth = 0);
void DeleteQuadTree(Node *n);
Node *BuildNode(Node *n, Node  *nParent, int index);

//-------------------------------------
// FUNCTIONS
//-------------------------------------

void setnode(Node *xy,int x, int y, int w, int h)
{
    int i;
    xy->posX = x;
    xy->posY = y;
    xy->width= w;
    xy->height= h;
    for(i=0;i<5000;i++)
    {
        xy->pointArray[i].x=560;
        xy->pointArray[i].y=560;
    }
    //Initialises child-nodes to NULL - better safe than sorry
    for (int i = 0; i < 4; i++)
        xy->child[i] = NULL;
}
int randn()
{
    int a;
    a=rand()%501;
    return a;
}

int pointArray_size(Node *n)
{
    int m = 0,i;
    for (i = 0;i<=5000; i++)
        if(n->pointArray[i].x <= 500 && n->pointArray[i].y <= 500)
            m++;
    return (m + 1);
}
//-------------------------------------
// MAIN
//-------------------------------------
int main()
{
    // Initialize the root node
    Node * rootNode = new Node;     //Initialised node
    int i, x[5000],y[5000];
    FILE *fp;
    setnode(rootNode,0, 0, 500, 500);


    // WRITE THE RANDOM POINT FILE  
    fp = fopen("POINT.C","w");

    if ( fp == NULL )
    {
        puts ( "Cannot open file" );
        exit(1);
    }
    for(i=0;i<5000;i++)
    {
        x[i]=randn();
        y[i]=randn();
        fprintf(fp,"%d,%d\n",x[i],y[i]);
    }
    fclose(fp);

    // READ THE RANDOM POINT FILE AND ASSIGN TO ROOT Node
    fp=fopen("POINT.C","r");
    for(i=0;i<5000;i++)
    {
        if(fscanf(fp,"%d,%d",&x[i],&y[i]) != EOF)
        {
            rootNode->pointArray[i].x=x[i];
            rootNode->pointArray[i].y=y[i];
        }
    }

    fclose(fp);

    // Create the quadTree
    BuildQuadTree(rootNode);
    PrintQuadTree(rootNode);    //Added function to print for easier debugging
    DeleteQuadTree(rootNode);

    return 0;
}

//-------------------------------------
// BUILD QUAD TREE
//-------------------------------------
void BuildQuadTree(Node *n)
{
    Node * nodeIn = new Node;   //Initialised node

    int points = pointArray_size(n);

    if(points > 100)
    {
        for(int k =0; k < 4; k++)
        {
            n->child[k] = new Node;     //Initialised node
            nodeIn = BuildNode(n->child[k], n, k);
            BuildQuadTree(nodeIn);
        }
    }
}
//-------------------------------------
// PRINT QUAD TREE
//-------------------------------------
void PrintQuadTree(Node *n, int depth)
{
    for (int i = 0; i < depth; i++)
        printf("\t");

    if (n->child[0] == NULL)
    {
        int points = pointArray_size(n);
        printf("Points: %d\n", points);
        return;
    }
    else if (n->child[0] != NULL)
    {
        printf("Children:\n");
        for (int i = 0; i < 4; i++)
            PrintQuadTree(n->child[i], depth + 1);
        return;
    }
}
//-------------------------------------
// DELETE QUAD TREE
//-------------------------------------
void DeleteQuadTree(Node *n)
{
    if (n->child[0] == NULL)
    {
        delete n;
        return;
    }
    else if (n->child[0] != NULL)
    {
        for (int i = 0; i < 4; i++)
            DeleteQuadTree(n->child[i]);
        return;
    }
}
//-------------------------------------
// BUILD NODE
//-------------------------------------
Node *BuildNode(Node *n, Node *nParent, int index)
{
    int numParentPoints, i,j = 0;

    // 1) Creates the bounding box for the node
    // 2) Determines which points lie within the box

    /*
     Position of the child node, based on index (0-3), is determined in this order:
     | 1 | 0 |
     | 2 | 3 |
     */

    setnode(n, 0, 0, 0, 0);

    switch(index)
    {
        case 0: // NE

            n->posX = nParent->posX+nParent->width/2;
            n->posY = nParent->posY+nParent->height/2;
            break;

        case 1: // NW

            n->posX = nParent->posX;
            n->posY = nParent->posY+nParent->height/2;
            break;

        case 2: // SW

            n->posX = nParent->posX;
            n->posY = nParent->posY;
            break;

        case 3: // SE

            n->posX = nParent->posX+nParent->width/2;
            n->posY = nParent->posY;
            break;

    }

    // Width and height of the child node is simply 1/2 of the parent node's width and height
    n->width = nParent->width/2;
    n->height = nParent->height/2;

    // The points within the child node are also based on the index, similiarily to the position
    numParentPoints = pointArray_size(nParent);

    switch(index)
    {
        case 0: // NE
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the top right quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX+nParent->width/2 && nParent->pointArray[i].y > nParent->posY + nParent->height/2 && nParent->pointArray[i].x <= nParent->posX + nParent->width && nParent->pointArray[i].y <= nParent->posY + nParent-> height)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;
        case 1: // NW
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the top left quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX && nParent->pointArray[i].y > nParent->posY+ nParent-> height/2 && nParent->pointArray[i].x <= nParent->posX + nParent->width/2 && nParent->pointArray[i].y <= nParent->posY + nParent->height)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            } 
            break;
        case 2: // SW
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the bottom left quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX && nParent->pointArray[i].y > nParent->posY && nParent->pointArray[i].x <= nParent->posX + nParent->width/2 && nParent->pointArray[i].y <= nParent->posY + nParent->height/2)
                {   
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;

        case 3: // SE
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the bottom right quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX +  nParent->width/2 && nParent->pointArray[i].y > nParent->posY && nParent->pointArray[i].x <= nParent->posX + nParent->width && nParent->pointArray[i].y <= nParent->posY + nParent->height/2)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;

    }
    return n;
}
Iman Nia
  • 2,255
  • 2
  • 15
  • 35
  • 1
    This is an excellent answer that I've already upvoted, but I couldn't fail noticing that the first example is not an R-Tree. Wikipedia shows something very different. Perhaps it could be a [grid file](https://en.wikipedia.org/wiki/Grid_file)? (Which I don't know what it is and I am still trying to find out...) – Stelios Adamantidis Jan 15 '20 at 10:21
  • 1
    @SteliosAdamantidis You are completely right, looks like Grid indexing is the correct term here. Thanks for your upvote and your comment. I am going to edit it for future seekers – Iman Nia Jan 16 '20 at 09:38
2

I'd argue that in this case a kd-tree would do better than a quadtree, since when using a quadtree, for finding a nearest neighbor, the closest object might be placed right on the other side of a division between nodes. Kd-trees on the other hand, allows implementation of very efficient nearest-neighbor search, although insertion and removal will be more difficult, while maintaining a balanced tree.

Nikolaj Dam Larsen
  • 5,455
  • 4
  • 32
  • 45
-1

There are a couple of logic errors:

for(i = 0; i <= numParentPoints-1; i++)
    return m;
Martin Brisiak
  • 3,872
  • 12
  • 37
  • 51