0

I am using Java 8 with JPA and Hibernate 5.2.1 and MySQL database.

I have a table with users GPS coordinates:

Row 1:

LAT_1: 28.85, LONG_1: -25.766667
LAT_2: 27.15, LONG_2: -26.133333

Row 2:

LAT_1: 27.25, LONG_1: -26.234567
LAT_2: 28.98, LONG_2: -25.456666

Then I have a user selected value of coordinates too:

LAT_USER: 27.22, LONG_USER: -26.89998

I need the result set in from the database query to return rows ordered by distance from the closest GPS point (asc).

In this example, the closest GPS point to LAT_USER & LONG_USER will be:

For Row 1:

 LAT_2 & LONG_2

For Row 2:

 LAT_1 & LONG_1

Is this possible?

Here is my base code so far:

    CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
    CriteriaQuery<T> criteriaQuery = criteriaBuilder.createQuery(entityClass);

    // from
    Root<T> from = criteriaQuery.from(entityClass);
    criteriaQuery.select(from);

    // order by
    List<Order> order = new ArrayList<Order>();
    // ????
    criteriaQuery.orderBy(order);

    List<T> results = (List<T>) entityManager.createQuery(criteriaQuery).setFirstResult(firstResult)
            .setMaxResults(maxResults).getResultList();
    return results;

I would appreciate anyone's help.

Thanks

UPDATE:

My table structure is as follows (I have other join tables, but this is for simplicity sake):

 ----------     -------------------    -----------
| EMPLOYEE |   | EMPLOYEE_LOCATION |  | LOCATION  |
 ----------     -------------------    -----------
|ID        |   |EMP_ID             |  | ID        |
|          |   |LOC_ID             |  | LATITUDE  |
|          |   |                   |  | LONGITUDE |
 ----------     -------------------    -----------

My Employee Model Object has locations:

@OneToMany(cascade = CascadeType.ALL, fetch=FetchType.EAGER)
@JoinTable
(
    name="employee_location",
    joinColumns={ @JoinColumn(name="EMP_ID", referencedColumnName="ID") },
    inverseJoinColumns={ @JoinColumn(name="LOC_ID", referencedColumnName="ID", unique=true) }
)
private Set<Location> locations;

I understand to achieve what I need, I need to do the Haversine formula calculation on the database because I need to order the result set with the calculations result.

I know this will result in a pretty complicated query because it needs to find the lowest distance, and order by that.

    SELECT e.id, 
    ( 6371 * acos( cos( radians(37) ) * cos( radians( l.LATITUDE) ) * cos( radians( l.LONGITUDE) - radians(-122) ) + sin( radians(37) ) * sin( radians( l.LATITUDE) ) ) ) AS distance
    FROM www.employee AS e 
    inner join
        www.employee_location as el
            on e.id=el.EMP_ID 
    inner join
        www.location as l
            on el.LOC_ID=l.ID
order by DISTANCE

This returns the ID and the DISTANCE correctly, however it returns 3 EMPLOYEES because there are 3 LOCATIONS for that EMPLOYEE. When it is wired in JPA (as above), this returns 1 EMPLOYEE with 3 LOCATIONS.

Does anyone know how I could preferably keep this in JPA objects and add custom SQL for the Haversine formula (much simpler because I have other joins. Or is this not possible?). Then, I need to order by the distance of the lowest LOCATION for that EMPLOYEE.

UPDATE

I will use something like this:

select ID, min(distance) as mindistance from (
select e.ID,
( 6371 * acos( cos( radians(37) ) * cos( radians( o.LATITUDE) ) * cos( radians( o.LONGITUDE) - radians(-122) ) + sin( radians(37) ) * sin( radians( o.LATITUDE) ) ) ) AS distance
from
    www.employee as e
inner join
    www.employee_location as eo
        on e.id=eo.EMP_ID 
inner join
    www.location as o
        on eo.LOC_ID=o.ID ) xxx
group by ID
order by mindistance; 
Community
  • 1
  • 1
Richard
  • 8,193
  • 28
  • 107
  • 228
  • The formula you are probably looking for here is something called the _Haversine_ formula, which computes the distance between two points on a sphere. You might want to do this calculation in your database, and you also might want to spatially partition your database. – Tim Biegeleisen Jul 26 '16 at 06:19
  • Thanks Tim. I agree, I need the calculation on the database, because it is doing the order by. I am trying to use SQL/JPQL/HQL to do this calculation. The difficulty I am having is for each row to know which coordinates to use, as well as the calculation itself. I am currently working on it, and may hopefully get some help here too. – Richard Jul 26 '16 at 06:31
  • Have a look [here](http://stackoverflow.com/questions/574691/mysql-great-circle-distance-haversine-formula) for a discussion of the Haversine formula. – Tim Biegeleisen Jul 26 '16 at 06:32
  • Thanks, having a look.... – Richard Jul 26 '16 at 06:36
  • Looks very comprehensive. I can definitely make use of the Haversine formula, exactly what I need. I just need to modify my SQL to order my the closest coordinate - appreciate you help. – Richard Jul 26 '16 at 07:00
  • Also keep in mind that you may want to look into spatial partitioning of your database if you expect to have a very large number of coordinates. MySQL doesn't support function indices, so you would have to call Haversine on each record to determine the matching record(s). – Tim Biegeleisen Jul 26 '16 at 07:06
  • Thanks Tim, I am no SQL expert, but will do some reading about spatial partitioning, as I do want this as performant as possible. – Richard Jul 26 '16 at 07:35
  • I think you only need to worry about the partition if you have many records (e.g. thousands and higher) and you expect frequent traffic. For just a pet problem, maybe you can eschew this. – Tim Biegeleisen Jul 26 '16 at 07:38
  • Okay thanks Tim. I have added my SQL above, if you have a sec, could you please have a look. – Richard Jul 26 '16 at 07:40
  • Please show your actual table structure, this is not too clear from your question. – Tim Biegeleisen Jul 26 '16 at 07:51
  • Thanks Tim. Structure added above – Richard Jul 26 '16 at 08:03
  • I'm confused...why does an employee have multiple locations associated with him? – Tim Biegeleisen Jul 26 '16 at 08:06
  • They can choose to work in up to 5 locations. – Richard Jul 26 '16 at 08:09
  • So do you want to choose the closest point, for each employee, the entire result set being ordered by that closest point? – Tim Biegeleisen Jul 26 '16 at 08:11
  • 1
    Just a question : any reason why you didn't store the info using a Geometric column type (e.g JTS Point) ? then you would have geospatial methods available that do all of that – Neil Stockton Jul 26 '16 at 08:12
  • Yes Tim, that's correct. I need to order on the closest point only. – Richard Jul 26 '16 at 08:14
  • Hi Neil, I didn't know about Geometric column types. I will have a read. Thank you – Richard Jul 26 '16 at 08:15
  • Here's a doc link that DataNucleus JPA provides support for http://www.datanucleus.org/products/accessplatform_5_0/jpa/types_geospatial.html and JPQL functions http://www.datanucleus.org/products/accessplatform_5_0/jpa/jpql_geospatial_functions.html Maybe Hibernate has some support also? – Neil Stockton Jul 26 '16 at 08:16
  • Neil, I just had a read about Special data Types in MySQL. It looks like I should change my table to either use five POINT data types, or one MULTIPOINT containing each. What would you recommend in light of what I am trying to do in the query above? (I like MULTIPOINT, because it offers flexibility in the number you store) – Richard Jul 26 '16 at 08:30
  • I've never used it for something like your structure. In choosing what to use, just think of what you want to use the data for, and because there are Spatial functions that give distance etc ... https://dev.mysql.com/doc/refman/5.6/en/spatial-relation-functions-object-shapes.html just depends whether you can get the relevant info out of a MULTIPOINT to use as input for which method you need – Neil Stockton Jul 26 '16 at 08:44
  • I think a MULTIPOINT suits my needs. Thanks both Neil & Tim, I appreciate your help. – Richard Jul 26 '16 at 08:46
  • I have decided rather not to use a MULTIPOINT, it just adds added complications and I don't really need the functionality. It does however leave me with the issue where I need to solve the above problem. – Richard Jul 26 '16 at 13:47

1 Answers1

0

I think you have two options

  1. Use the spatial support in MySql
  2. Use the spatial support in hibernate-search (sits on top of lucene)

For option 1, it's likely you won't have hibernate support for this so I'm guessing you'll need Session.createSQLQuery(...) and/or Restrictions.sqlRestriction(...)

My personal preference is for option 1 as managing the lucene index can add complexity

lance-java
  • 25,497
  • 4
  • 59
  • 101
  • Thanks Lance, will have a look into that. – Richard Jul 26 '16 at 15:35
  • Lance, if I use spatial support, it holds the data correctly (i.e. as geometry points), but this still does not solve my problem I don't think? I need to be able to find the lowest of n points and order by that. – Richard Jul 26 '16 at 15:39
  • I, personally, think hibernate is one of the biggest time wasters in the java world. If not too late, I suggest you ditch hibernate all together and switch to [mybatis](http://www.mybatis.org/mybatis-3/) – lance-java Jul 26 '16 at 15:40
  • I'm guessing that's similar to iBatis. Less bloat than Hibernate. I will have a look, thanks. – Richard Jul 26 '16 at 15:43
  • I think you'll use `st_within` to limit the search and order by `st_distance`. Eg `select * from table where st_within(...) order by st_distance(...)`. Some info [here](https://www.percona.com/blog/2013/10/21/using-the-new-mysql-spatial-functions-5-6-for-geo-enabled-applications/) – lance-java Jul 26 '16 at 15:50
  • mybatis deals with pure sql. Every hibernate app I've ever seen performs hundreds (thousands) of unnecessary queries. Try [switching on sql logging](http://stackoverflow.com/questions/2536829/hibernate-show-real-sql) and I bet you see lots of rubbish that you wouldn't be doing if hand crafting the sql yourself. – lance-java Jul 26 '16 at 15:58
  • Thanks, having a look at the MySQL Spatial Functions – Richard Jul 26 '16 at 16:04
  • Looks like MySQL [Spatial Functions][https://www.percona.com/blog/2013/10/21/using-the-new-mysql-spatial-functions-5-6-for-geo-enabled-applications/] has limitations. For MySQL 5.6 Given those major limitations, it is not very easy to use st_distance function for the geo enabled applications. If we simply need to find a distance between 2 points it is easier to store lat, lon directly and use harvesine expression (as described above). – Richard Jul 26 '16 at 16:11
  • I am however using MySQL 5.5. A limitation of the environment I am using. So there is no st_dist support. I think I need to write native SQL using the harvesine expression. – Richard Jul 26 '16 at 16:13
  • Be careful with the where clause. It's common to do the calculation in two phases. First filter out anything not in the square (bounding box). Then use sin / cosine to narrow down to a circle. – lance-java Jul 26 '16 at 18:11
  • Thanks for your help. I have updated above with what I am going to use. – Richard Jul 26 '16 at 18:38
  • Ok, looks like you're not doing `where distance < x` so you can ignore the bounding box comments. If you start filtering by distance over thousands of records (eg find items 5km from me) you might need to implement a two-phase search using bounding box first for better performance – lance-java Jul 27 '16 at 08:14
  • Thanks. Nope I am not filtering. I do have a fetch size though and am using infinite scroll. So in theory the user can get a lot of data, but it's unlikely. – Richard Jul 28 '16 at 12:05