0

I have two tables, one is a table of employee names, 176 records. The other is a table (with duplicates) of employee names (same format) and their locations (7943 rows).

From this answer i deduced i needed a left join to give me the rows from Table A only.

I was hoping the below would give me the original 176 rows back from Table A, each column with a value for location from Table B, else blank if not available, however it gives me 7601 rows which i cannot for the life of me understand:

SELECT e.[UniqueName], l.[location]
FROM [Employees] as e

left join Locations as l
on e.[UniqueName] = l.[UniqueName]

Even using a group by (which I'm not sure why this would be necessary given that I am asking only for whats in Table A) gives 172 rows even though each name in the Employees table is unique!

andre1990
  • 107
  • 2
  • 9

3 Answers3

1

The table Locations contains more than 1 locations for each employee and this is why you get so many rows in the results.
If you want just 1 location and it does not matter which 1 then add aggregation to your query:

SELECT e.[UniqueName], MAX(l.[location]) AS location
FROM [Employees] as e
LEFT JOIN Locations as l
ON e.[UniqueName] = l.[UniqueName]
GROUP BY e.[UniqueName]

You can use MIN() instead of MAX().

forpas
  • 160,666
  • 10
  • 38
  • 76
1

I was hoping the below would give me the original 176 rows back from Table A, each column with a value for location from Table B, else blank if not available, however it gives me 7601 rows which I cannot for the life of me understand...

Whilst a left join will always return all records from the dataset on the lefthand side of the join, the number of records returned by the query will depend upon the number of possible pairings between the two datasets, which (for a left join) will always be greater than or equal to the number of records in the dataset to the left of the join.

For your example, consider the following two datasets:

Employees

+------------+
| UniqueName |
+------------+
| Alice      |
| Bob        |
| Charlie    |
+------------+

Locations

+------------+----------+
| UniqueName | Location |
+------------+----------+
| Alice      | London   |
| Bob        | Berlin   |
| Bob        | New York |
| Bob        | Paris    |
+------------+----------+

Evaluating the query:

select 
    e.[uniquename], l.[location]
from 
    [employees] as e left join locations as l 
    on e.[uniquename] = l.[uniquename]

Will cause the records to be paired up in the following manner:

enter image description here

And will therefore return the result:

+------------+----------+
| uniquename | location |
+------------+----------+
| Alice      | London   |
| Bob        | Berlin   |
| Bob        | New York |
| Bob        | Paris    |
| Charlie    |          |
+------------+----------+
Lee Mac
  • 15,615
  • 6
  • 32
  • 80
0

You can use a correlated subquery:

SELECT e.[UniqueName],
       (SELECT TOP 1 l.[location]
        FROM locations as l
        WHERE e.[UniqueName] = l.[UniqueName]
       ) as location
FROM [Employees] as e;

Note: There is no ORDER BY so this returns an arbitrary location.

If location can be duplicated for a given UniqueName, you will get an error. To solve that, you can use an aggregation functions:

SELECT e.[UniqueName],
       (SELECT MAX(l.[location])
        FROM locations as l
        WHERE e.[UniqueName] = l.[UniqueName]
       ) as location
FROM [Employees] as e;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Hi Gordon. The above seems dependent on the Name existing in both tables (as i asked), but on reflection i'm wondering how would you search all Names in Table A on Table B, returning a location where a name match is found, but also returning the row with a blank location field where a match isn't found? – andre1990 Jan 21 '20 at 19:37
  • @andre1990 . . . Not at all. If the name is not in `locations`, then the subquery returns `NULL`. – Gordon Linoff Jan 21 '20 at 22:02