8

I need to construct a R tree using given data points.I have searched for implementation of R tree.All the implementation i found construct r tree when given coordinates of rectangle as input.I need to construct r tree when given data points itself(it can be 1 dimensional).The code should take care of creating rectangles which encloses these data points and construct r tree.

rajsekhar
  • 403
  • 1
  • 5
  • 14

3 Answers3

6

Use MBRs (Minimum bounding rectangle) with min = max = coordinate. They all do it this way. Good implementations will however store point data with approximately twice the capacity in the leaves than in directory nodes.

Cléssio Mendes
  • 996
  • 1
  • 9
  • 25
Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
  • Can to elaborate a bit? In most R-tree code, you would create a rectangle (with the lower-right and upper-right points) and add the rectangle to the R-tree. Are you saying that if we want to store a single point, then both points (lower-right and upper-right) should be that same single point? – stackoverflowuser2010 Sep 25 '13 at 17:39
  • The MBR of a single point has of course `min=max=coordinate`. By storing points instead of MBRs at the leaf level, we get roughly twice as many objects in there, which reduces disk space by almost 2. – Has QUIT--Anony-Mousse Sep 25 '13 at 17:43
  • Thanks. So by deciphering your answer, I assume that you must mean "yes, if we want to store a single point, then both points (lower-right and upper-right) should be that same single point". Is an R-tree better for storing points than other data structures, like a point-region quadtree? – stackoverflowuser2010 Sep 25 '13 at 17:56
  • Depends on your data and query and implementation. In my experiments, R-trees are excellent, and outperform e.g. k-d-trees easily because of the better fanout. Choosing the right page size is essential. As for quadtrees, R-trees have a better fanout, and are data adaptive, so if your data is clustered they will likely be better. – Has QUIT--Anony-Mousse Sep 25 '13 at 17:59
  • Will your approach work if I'm trying to store GPS points with 5-digit floating-point precision (e.g. 37.80995, -122.47735)? – stackoverflowuser2010 Sep 27 '13 at 20:38
  • Well, I'm not aware of a good 5 digit floating point binary encoding. But yes, you could use Integers, 32 bit integer precision should be a reasonable choice for GPS data. – Has QUIT--Anony-Mousse Sep 29 '13 at 13:45
2

If you're looking for a C++ implementation, the one contained in Boost.Geometry currently (Boost. 1.57) is able to store Points, Boxes and Segments. The obvious advantage is that the data in leafs of the tree is not duplicated which means that less memory is used, caching is better, etc. The usage looks like this:

#include <boost/geometry.hpp>
#include <boost/geometry/geometries/geometries.hpp>
#include <boost/geometry/index/rtree.hpp>

#include <vector>

namespace bg = boost::geometry;
namespace bgi = boost::geometry::index;

int main()
{
    typedef bg::model::point<float, 2, bg::cs::cartesian> point;
    typedef bg::model::box<point> box;

    // a container of points
    std::vector<point> points;

    // create the rtree
    bgi::rtree< point, bgi::linear<16> > rtree(points.begin(), points.end());

    // insert some additional points
    rtree.insert(point(/*...*/));
    rtree.insert(point(/*...*/));

    // find points intersecting a box
    std::vector<point> query_result;
    rtree.query(bgi::intersects(box(/*...*/)), std::back_inserter(query_result));

    // do something with the result
}
Adam Wulkiewicz
  • 2,068
  • 18
  • 24
0

I guess that using an Rtree to store points seems like a misuse. Although this kind of structure is indicated to store spatial data, after some research I just found out it is best suited for storing non-zero area regions (as the R from the name is for Region or Rectangle). Creating a simple table with a nice index should offer better performance either for updating and searching data. Consider my example below:

CREATE TABLE locations (id, latitude, longitude);
CREATE INDEX idx_locations ON locations (latitude, longitude);

is preferable over

CREATE VIRTUAL TABLE locations USING rtree( id, minLatitude, maxLatitude, minLongitude, maxLongitude);

if you are just planning to repeat minLatitude over maxLatitude and minLongitude over maxLongitude for every row as to represent points and not rectangles. Although the latter approach will work as expected, Rtrees are suited to index rectangle areas and using them to store points is a misuse with worst performance. Prefer a compound index as above.

Further reading: http://www.deepdyve.com/lp/acm/r-trees-a-dynamic-index-structure-for-spatial-searching-ZH0iLI4kb0?key=acm

Cléssio Mendes
  • 996
  • 1
  • 9
  • 25
  • 1
    Can't agree. I needed points-only indexes and R-tree queries were faster than compound index (SQLite). Spatiality is not only about data, but also about queries: if you want to know all the points (data) contained within a rectangle (query), try R-trees. – pwes May 30 '13 at 21:15