0

We have two customer related tables below and a query with red rows for the undesired results. We are trying to get the results for each customer in customer_data to include their respective latest phone_number (from the customer_phones table based on theupdated field) that is default and not deleted (shown in query below). The customer_data.customer_id is the same as customer_phones.user_id and where we do the left join on.

1) DB table name/data

customer_data

enter image description here

customer_phones

enter image description here

2) Query

SELECT cd.id, cd.customer_id, cd.customer_name, cp.phone_number, cp.updated
FROM customer_data cd
LEFT JOIN customer_phones cp ON cp.user_id=cd.customer_id 
      AND cp.is_default_phone='1' AND cp.deleted IS NULL
ORDER BY cd.id

3) Results (with undesired rows in red)

enter image description here

The red rows are undesired since each customer should only have one phone number that is both default and updated latest (in case they have more than one default number the one with the latest updated would be the correct one to pull into the results).

We expect these table to have a lot of fields in a few months and want to make sure we don't have a solution with an n+1 issue if using a subquery of some sort. So what would be the most efficient way to query the tables to get the desired results above but without the red undesired rows?

Wonka
  • 8,244
  • 21
  • 73
  • 121

1 Answers1

1

You have to set one more condition in the ON clause to get the latest updated row only:

SELECT cd.id, cd.customer_id, cd.customer_name, cp.phone_number, cp.updated
FROM customer_data cd
LEFT JOIN customer_phones cp 
ON cp.user_id=cd.customer_id AND cp.is_default_phone='1' AND cp.deleted IS NULL AND
  cp.updated = (
    select max(updated) from customer_phones 
    where user_id=cd.customer_id AND is_default_phone='1' AND deleted IS NULL
  )
ORDER BY cd.id
forpas
  • 160,666
  • 10
  • 38
  • 76
  • Very nice answer, I never thought of adding one more condition of cp.updated = subquery. Thank you so much for this, it works really well! – Wonka May 24 '19 at 01:00
  • Not that it should happen, but hypothetically, if the updated timestamps are exactly the same for Adam's first 2 results, the subquery select max(update) would return 2 rows instead of one for Adam. Is there a way to ensure only one row would ever be be returned per user (if both update timestamps are the same, say having a secondary unique field such as the highest id) so the results will never have a duplicated user row instance? – Wonka May 24 '19 at 01:38
  • *select max(update) would return 2 rows instead of one for Adam* only if there are 2 rows with that date both set to `is_default_phone='1'` and `cp.deleted IS NULL`. Is there such a case? – forpas May 24 '19 at 08:40
  • I don't think there would be such a case and your query works perfectly, it was just for learning purposes to see if it was possible to only return one result per user no matter what, even if the `updated` was the same. I think what we have here will suffice though, I don't want to complicate it further. I was just curios :) – Wonka May 24 '19 at 20:45