I have two tables of service providers, providers
and providers_clean
. providers
contains many thousands of providers with very poorly formatted data, providers_clean
only has a few providers which still exist in the 'dirty' table as well.
I want the system using this data to remain functional while the user is 'cleaning' the data up, so I'd like to be able to select all of the rows that have already been 'cleaned' and the rows that are still 'dirty' while excluding any 'dirty' results that have the same id as the 'clean' ones.
How can I select all of the providers from the providers_clean
table merged with all of the providers from the providers
table, and EXCLUDE the ones that have already been 'cleaned'
I've tried:
SELECT * FROM providers WHERE NOT EXISTS (SELECT * FROM providers_clean WHERE providers_clean.id = providers.id)
which gives me all of the 'dirty' results from providers
EXCLUDING the 'clean' ones, but how can I rewrite the query to now merge all of the 'clean' ones from providers_clean
?
Here's a visual representation of what I'm trying to do:
Clean Table
+----+-------------------+
| ID | Name |
+----+-------------------+
| 1 | Clean Provider 1 |
| 4 | Clean Provider 4 |
| 5 | Clean Provider 5 |
+----+-------------------+
Dirty Table
+----+------------------+
| ID | Name |
+----+------------------+
| 1 | Dirty Provider 1 |
| 2 | Dirty Provider 2 |
| 3 | Dirty Provider 3 |
| 4 | Dirty Provider 4 |
| 5 | Dirty Provider 5 |
+----+------------------+
Desired Result
+----+------------------+
| ID | Name |
+----+------------------+
| 1 | Clean Provider 1 |
| 2 | Dirty Provider 2 |
| 3 | Dirty Provider 3 |
| 4 | Clean Provider 4 |
| 5 | Clean Provider 5 |
+----+------------------+
Thanks
UPDATE
This is working, however, Is there a more efficient way to write this query?
SELECT providers.id AS id,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.provider_name
ELSE
providers.provider_name
END AS pname,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.phone
ELSE
providers.phone
END AS pphone,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.website
ELSE
providers.website
END AS pwebsite
FROM providers
LEFT JOIN providers_clean ON providers_clean.id = providers.id
ORDER BY providers.id asc