I'm having the following table layout: four different tables, each containing around 10 to 15 million entries. Three string attributes of each table are the same (let's call them Id, Name1, Name2). Now we want to read all entries having the same Id column but different (Name1,Name2) tuples. It is estimated that less than 0.5 % of all entries are matching.
We've created a view AllEntries (basically a UNION ALL of the relevant attributes over all four tables) and our query looks like this:
SELECT *
FROM AllEntries
GROUP BY Id
HAVING COUNT(DISTINCT(Name1)) > 1 OR COUNT(DISTINCT(Name2)) > 1
Executing the query in our test database with 2 million entries in each table (i.e. 8 million entries in the view) already takes around 2 to 3 minutes (nice server).
Q: Is there any performance improvement possible to improve the performance?