I have a Hive table with many thousands of points. The only columns are latitude and longitude. I know in advance that these points all lie in a certain area, and the outer edge of the points forms a continuous polygon, but many of the points are interior. I'm trying to determine which points make up the convex hull, for visualization. I don't want to use all the points, because the interior has messy holes that don't look good in a visualization. I'm using hive-1.2.1000.2.4.2.0. Here's what I tried:

hive> add jar /home/me/gis-tools-for-hadoop/samples/lib/esri-geometry-api.jar;
Added [/home/me/gis-tools-for-hadoop/samples/lib/esri-geometry-api.jar] to class path
Added resources: [/home/me/gis-tools-for-hadoop/samples/lib/esri-geometry-api.jar]
hive> add jar /home/me/gis-tools-for-hadoop/samples/lib/spatial-sdk-hadoop.jar;
Added [/home/me/gis-tools-for-hadoop/samples/lib/spatial-sdk-hadoop.jar] to class path
Added resources: [/home/me/gis-tools-for-hadoop/samples/lib/spatial-sdk-hadoop.jar]
hive> create temporary function ST_ConvexHull AS 'com.esri.hadoop.hive.ST_ConvexHull';
OK
Time taken: 0.014 seconds
hive> create temporary function ST_AsText AS 'com.esri.hadoop.hive.ST_AsText';
OK
Time taken: 0.009 seconds
hive> create temporary function ST_Point AS 'com.esri.hadoop.hive.ST_Point';
OK
Time taken: 0.009 seconds
hive> SELECT ST_AsText(ST_ConvexHull(ST_Point(latitude, longitude))) FROM sandbox11.cnst_zn;

I have also tried flipping latitude and longitude order in my query. In both instances, I get 'MULTIPOLYGON EMPTY' as the response. The documentation of the UDF is here: https://github.com/Esri/spatial-framework-for-hadoop/wiki/UDF-Operations#st_convexhull

AJD
  • You may find some of the ideas on [this question](http://stackoverflow.com/q/41268547/752843) to be helpful. – Richard Dec 28 '16 at 20:31

1 Answer

If you want the convex hull of the geometries across all records of a table, use ST_Aggr_ConvexHull rather than ST_ConvexHull (which expects multiple geometries from a single row).

[collaborator]

Update: the syntax for aggregate ConvexHull would be similar to the syntax for aggregate Union, for which we have an example in an article.
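As a rough sketch of what the aggregate form might look like for the table in the question: this assumes the two JARs are already added to the session as shown there, and that the aggregate UDF is registered under the class name `com.esri.hadoop.hive.ST_Aggr_ConvexHull`, following the same package pattern as the other functions. Note that ST_Point takes longitude first, then latitude.

```sql
-- Assumption: the esri-geometry-api and spatial-sdk-hadoop JARs are already added.
-- Assumption: the aggregate class follows the same package naming as the other UDFs.
CREATE TEMPORARY FUNCTION ST_Aggr_ConvexHull AS 'com.esri.hadoop.hive.ST_Aggr_ConvexHull';
CREATE TEMPORARY FUNCTION ST_AsText AS 'com.esri.hadoop.hive.ST_AsText';
CREATE TEMPORARY FUNCTION ST_Point AS 'com.esri.hadoop.hive.ST_Point';

-- Build one point per row (x = longitude, y = latitude), then aggregate
-- all rows into a single convex hull and print it as WKT.
SELECT ST_AsText(ST_Aggr_ConvexHull(ST_Point(longitude, latitude)))
FROM sandbox11.cnst_zn;
```

Because the function is an aggregate, the query should return a single row for the whole table; adding a GROUP BY would give one hull per group instead.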

Randall Whitman
  • Hi Randall Whitman, thanks as always for the help. What is this UDF expecting? I have 2 columns, latitude, longitude. Should I concat with a comma, or reverse order or anything like that? – AJD Dec 28 '16 at 21:02
  • You would create each point with ST_Point, like ST_Point(longitude, latitude). – Randall Whitman Dec 28 '16 at 21:05