0

We have user reviews of products that have following things:

  1. Review description
  2. Date on which review was written
  3. City from where review was written

We have two tables here:

1) Reviews (Columns: Description (varchar), date (DateTime), CityId (int))

2) Cities (Columns : Id (int), Name (varchar), Lat, Long)

We want to sort these user reviews such that if input is any given city, let's say city 'C', the output sort order should be:

  • Reviews written for cities within 0 to 'X' kilometres of city 'C', followed by
  • Reviews written for cities in 'X' to '2X' kilometres of city 'C', followed by
  • Reviews written for cities in '2X' to '3X' kilometres and so on till the cities end.

One simple approach to solve this is:Just do it on runtime and fetch cities in these kilometre ranges one by one and then review for those but it would be too slow. Also, we will do same operation multiple times. To avoid recomputation we can save the information of nearby bucket city for each city but not sure if that's ideal.

Is there any simpler and more efficient way to solve this i.e. by kat/long or any other approach?

Sahil Sharma
  • 3,847
  • 6
  • 48
  • 98
  • Please include the list of tables with their columns. Ideally a http://sqlfiddle.com/ . It's certainly possible to retrieve the reviews the way you want. However, the appropriate indexes need to be put in place to ensure a decent perfomance (most likely this is an online query). – The Impaler Aug 14 '19 at 01:51
  • Done. Please check if you can help now. – Sahil Sharma Aug 14 '19 at 05:40

1 Answers1

1

You can retrieve the reviews sorted by distance by using the Haversine Formula; see Haversine implementation to write a MySQL function that can be used directly in your query.

In the query below (also in SQL Fiddle) I used a simpler formula to illustrate how the query will work using "Los Angeles" (789) as the given city:

select *
from (
  select
    r.*,
    sqrt(power(g.lat - c.lat, 2) + power(g.lng - c.lng, 2)) as dist -- simple dist
  from reviews r
  join cities c on c.id = r.cityid -- city the review belongs to
  join cities g on g.id = 789 -- given city as input
) x
order by dist

Note the table cities is joined twice: first as the city the review belongs to, then as the given [base] city. This allow us to compute the distance.

Result:

description  recorded             cityid  dist   
-----------  -------------------  ------  -------
review 4     2019-01-01 07:34:56  789     0      
review 2     2019-01-01 07:34:56  456     5.65685
review 3     2019-01-01 07:34:56  456     5.65685
review 1     2019-01-01 07:34:56  123     6.32456

For reference, this is the data script I used:

create table reviews (
  description varchar(100),
  recorded datetime,
  cityid int
);

insert into reviews (description, recorded, cityid) values 
  ('review 1', '2019-01-01 12:34:56', 123),
  ('review 2', '2019-01-01 12:34:56', 456),
  ('review 3', '2019-01-01 12:34:56', 456),
  ('review 4', '2019-01-01 12:34:56', 789);

create table cities (
  id int,
  name varchar(20),
  lat double,
  lng double
);

insert into cities (id, name, lat, lng) values 
  (123, 'Chicago', 10, 11),
  (456, 'Indianapolis', 12, 9),  
  (789, 'Los Angeles', 8, 5);
The Impaler
  • 45,731
  • 9
  • 39
  • 76