How to fix an "ambigous column name error using inner join" error

Question

I am taking a query from a database, using two tables and am getting the error described in the title of my question. In some cases, the field I need to query by is in table A, but others are in table B. I dynamically create columns to search for (which can either be in table A or table B) and my WHERE clause in my code is causing the error.

Is there a dynamic way to fix this, such as if column is in table B then search using table B, or does the INNER JOIN supposed to fix this (which it currently isn't)

Table A fields: id

Table B fields: id

SQL code

SELECT *
FROM A INNER JOIN B ON A.id = B.id
WHERE 
<cfloop from="1" to="#listLen(selectList1)#" index="i">

    #ListGetAt(selectList1, i)# LIKE UPPER(<cfqueryparam cfsqltype="cf_sql_varchar" value="%#ListGetAt(selectList2,i)#%" />) <!---
                                                    search column name = query parameter

                                                    using the same index in both lists
                                                    (selectList1) (selectList2) --->
    <cfif i neq listLen(selectList1)>AND</cfif> <!---append an "AND" if we are on any but
                                                the very last element of the list (in that
                                                case we don't need an "AND"--->
</cfloop>

Question posed here too

I would like to be able to search any additional fields in both table A and table B with the id column as the data that links the two.

Are you dynamically adding SQL statements? If so, are all the columns added proceeded by the table aliases, IE. A.column, B.column? — BClaydon, Jun 10 '13 at 18:14

BClaydon · Accepted Answer · 2013-06-12T16:35:06.273

6

Employee
------------------
Emp_ID  Emp_Name    Emp_DOB Emp_Hire_Date   Emp_Supervisor_ID


Sales_Data
------------------
Check_ID    Tender_Amt  Closed_DateTime Emp_ID

Every column you reference should be proceeded by the table alias (but you already knew that.) For instance;

SELECT E.Emp_ID, B.Check_ID, B.Closed_DateTime
FROM Employee E 
    INNER JOIN Sales_Data SD ON E.Emp_ID = SD.Emp_ID

However, when you select all (*) it tries to get all columns from both tables. Let's see what that would look like:

SELECT *
FROM Employee E 
    INNER JOIN Sales_Data SD ON E.Emp_ID = SD.Emp_ID

The compiler sees this as:

**Emp_ID**, Emp_Name, Emp_DOB, Emp_Hire_Date, Emp_Supervisor_ID, 
Check_ID, Tender_Amt, Closed_DateTime, **Emp_ID**

Since it tries to get all columns from both tables Emp_ID is duplicated, but SQL doesn't know which Emp_ID comes from which table, so you get the "ambiguous column name error using inner join".

So, you can't use (*) because whatever column names that exist in both tables will be ambiguous. Odds are you don't want all columns anyway.

In addition, if you are adding any columns to your SELECT line via your cfloop they must be proceed by the table alias as well.

--Edit: I cleaned up the examples and changed "SELECT * pulls all columns from the first table" to "SELECT * pulls all columns from both tables". Shawn pointed out I was incorrect.

edited Jun 12 '13 at 16:35

answered Jun 10 '13 at 18:06

BClaydon

1,900
2
20
34

The * actually references all columns within the query doesn't it? Though you can of course do `A.*` or `B.*`. Generally preferable though to name the columns you need for clarity. – Simon at The Access Group Jun 10 '13 at 21:50
The * does refer to all columns in the query. The query above produces two ID columns. And any other columns with the same name in both tables will be duplicated. That's why there's an error. If you try to sort by any of those columns, without providing a table alias, SQL won't be able to tell which column you're referring to. Table aliases and referring to them in the WHERE clause by the alias should eliminate this problem. – Shawn Jun 12 '13 at 15:36
And I agree with Simon. SELECT * is usually not good. It's more typing, but list out the columns that you actually need to return in your results. And only those columns. If you only need two or three columns from each table, you don't need to select every column in both tables. It will make your query much more efficient and readable. – Shawn Jun 12 '13 at 15:38
Thanks Shawn. I was wrong, * will get ALL columns from both sides of the join. Duh! I verified in my lab this morning. I agree you don't _have_ to add aliases, but you definitely _should_. I will update my post now. – BClaydon Jun 12 '13 at 16:27

score 2 · Answer 2 · answered Jun 10 '13 at 17:09

2

You have to write your where clause in such a way that you can say A.field_from_A or B.field_from_B. You can always pass A.field_from_A.

Although, you don't really want to say

SELECT * FROM A INNER JOIN B ON A.id=B.id where B.id = '1'.

You would want to say

SELECT * FROM B INNER JOIN A ON B.id=A.id where B.id = '1'

You can get some really slow queries if you try to use a joined table in the where clause. There are times when it's unavoidable, but best practice is to always have your where clause only call from the main table.

answered Jun 10 '13 at 17:09

thomasw_lrd

534
3
13

Got a reference for that last statement you make? It sounds interesting & I'd like to read up on it. Cheers. – Adam Cameron Jun 10 '13 at 21:49
I've seen no references in this but anecdotally have seen some minor performance issues when I put all the limiting criteria within the WHERE compared to placing join-specific limiting criteria into the specific JOINs. This may of course be MySQL being peculiar in these examples when it comes to optimisation. Unfortunately I have no sources to confirm one way or the other. – Simon at The Access Group Jun 10 '13 at 21:53
My observations are that referring to a joined table in the where clause does not affect performance. However, I don't use MySQL. – Dan Bracuk Jun 10 '13 at 23:58
My use case is in Progress SQL, I've built bad queries that would take 27 seconds to run with several joins, but by changing the joins, I've managed to make the query's run in half the time. MYSQL 5.6 has a new join optimizer that may make my statement irrelevant. – thomasw_lrd Jun 11 '13 at 14:09
@thomasw_lrd - Yes, my impression is it all depends on the intelligence of the db's optimizer. I remember in in old versions of MS SQL, you could often improve execution time simply by switching the order of joins or `where` clause conditions. But later versions were smart enough to treat them the same, regardless of order. – Leigh Jun 11 '13 at 14:22
I'd agree with Leigh. Most db optimizers are smart enough to do the joins most efficiently. Essentially, there's nothing different about the two joins above, since they are INNER JOINs. But I've always found it better to keep the JOIN criteria together, for consistency. INNERs don't make a difference, but OUTERs will. "SELECT * FROM A LEFT OUTER JOIN B ON A.id=B.id WHERE B.id = '1'" can return MUCH different results than "...LEFT OUTER JOIN B ON A.id=B.id AND B.id = '1'". – Shawn Jun 12 '13 at 15:27

score 1 · Answer 3 · answered Jun 11 '13 at 00:05

1

When you assemble your select1 variable, add the tablenames or aliases. In other words, instead of resembling this:

select1 = "fred,barney,wilma,pebbles";

Make it resemble this:

select1 = "a.fred,a.barney,b.wilma,b.pebbles";

answered Jun 11 '13 at 00:05

Dan Bracuk

20,699
4
26
43

How to fix an "ambigous column name error using inner join" error

3 Answers3

Linked