1

If I have two tables:

+-----------------+
+ people          |
+----+------+-----+
| id | name | age |
+----+------+-----+
|  1 | Joe  |  42 |
|  2 | Sue  |  30 |
+----+------+-----+

+-------------------+
+ employees         |
+----+------+-------+
| id | dept | hired |
+----+------+-------+
|  1 | HR   |  2015 |
|  2 | Dev  |  2013 |
+----+------+-------+

And I query for someone like so:

SELECT * FROM employees JOIN people ON employees.id=people.id WHERE id=2;

Why does this throw an error? I know id isn't unique, but I'm forcing it to be common since it MUST be shared due by the JOIN. I also know I can bypass this by asking SELECT *,people.id AS id FROM... but this is a question of theory, not looking for a solution.

Is this just a case of MySQL not being developed enough? Or is there a valid reason to throw this error?

Bing
  • 3,071
  • 6
  • 42
  • 81
  • MySQL simply doesn't know which id you're comparing, and doesn't have the insight to spot that in this instance it makes no difference – Strawberry Mar 10 '17 at 01:51
  • Doesn't matter the fact that you are making it unique (by joining then). Think on sql parser, how it suppose to know which column (id) from which table it should pick to filter? – Jorge Campos Mar 10 '17 at 01:52
  • To an human eye this is obvious, but for a computer not much. In a small case like this it would be plain simple to tweak it to work in the parser, now imagine this query with hundreds (exaggeration to enforce the argument) of tables with the ID.... – Jorge Campos Mar 10 '17 at 01:54
  • Doesn't it perform the JOIN before filtering the columns though? Then couldn't it simply track which columns it JOINs upon and, if they share a name, treat them as a single column (or auto-alias them to the common name in order to prevent the enumeration for every common column in the select)? It seems like this could be done at the parser level with O(1) performance. I must be missing something, but obviously I don't know what. – Bing Mar 10 '17 at 01:58
  • Perhaps it could be done, but why? It's more processing to save you a few keystrokes. At any rate, no its not done. So that's why you have to specify. – Ilion Mar 10 '17 at 02:01
  • *Perhaps it could be done, but why? It's more processing to save you a few keystrokes.* Any modern language is based on this principle. What I'm wondering is if there's a valid logical reason other than "we simply haven't implemented that idea yet." – Bing Mar 10 '17 at 02:03
  • The database standards (SQL ANSI) is pretty old, so the **why** they didn't it? Back in the day everything you could save in performance was important, so it became standard. Also, it is not as simple as you think. imagine this squema: `t1 (id, name); t2(id, id_t1, title)` the natural join here would be `t1.id=t2.id_t1` how the parser will know which id should it uses if you added an where clause `where id=2`? – Jorge Campos Mar 10 '17 at 02:11
  • Also your model is not common at all, in your case the table employees is not needed the dept and hired columns should be in people table. – Jorge Campos Mar 10 '17 at 02:14
  • You have made the columns equal by the expression `ON employees.id=people.id` but they are still _two_ different columns. What if you had included `SELECT name as id` in the query? Now which of the _three_ `id`s should the `where` clause apply to? Just because it looks natural, and makes sense to a human brain does not mean it works with digital logic. – Chindraba Mar 10 '17 at 02:17
  • @Jorge My example was simplified. I was writing queries for my table which was more like people_id in both tables when I realized it was getting redundant. I was only suggesting it recognize when both tables join on the same column name (and no other tables have the joined names) to alias it. Perhaps I'm in a rare group who uses table.table_id structure for my tables instead of table.id – Bing Mar 10 '17 at 02:19
  • You did be surprised on how big is this "group" :) ! Even with your example with columns with the same name you have to consider other cases like a third table in the joining with a column with same name of the other two that are being joined and it isn't part of the join so you would have would be ok for the first two, but what about the third? `t1(id, name); t2(id, job); t3(id, id_t2, bla)` how it would solve the `where id=2` clause? The parser would have to process specific small cases, so, doesn't worth it. – Jorge Campos Mar 10 '17 at 02:26
  • Using your three table example: that would be a pretty bad naming convention, and wouldn't apply for the "rare group" of people I'm in who are using strict `table_id` naming conventions for *every* table. For example, if I want to JOIN email_addresses.person_id=people.person_id=employees.person_id so that I get one row per email address with the person's name, I know the person_id is going to be common because I'm not going to foolishly create a DIFFERENT person_id column for no reason, so PHP `$row['person_id']` would be handy. Seemed practical to me, but maybe I'm a "rare group" :) – Bing Mar 10 '17 at 16:23
  • I applaud the use of strict naming conventions. Since they are in use for you, the `NATURAL JOIN` seems even more useful for your system. Creating a virtual table in a sub select makes your intentions for the question and for PHP `$row['person_id']` work in a manner that matches your though processes. – Chindraba Mar 12 '17 at 05:51

3 Answers3

1

What you are doing in your mind is treating the joined tables as one, and in that view there is a single id column. MySQL can accomodate that view by using a sub-select and a natural join.

SELECT * FROM (SELECT * FROM employees NATURAL JOIN people) AS t2 WHERE id=2;

That will collapse all common columns while also using them as the condition of the join. The returned columns will be in the order of the common columns, remaining columns form first table, and remaining columns from second table. In the common columns, if there are more than one, their order will be the same as they are in the first table, even if they have a different order in the second table.

This allows you to implement the logic from within MySQL that you are using in your mind's view of the data.

The caveat to this is that if there is more than one common column, the results may not be what you'd expect. Then the join will force each column to match before they are considered a "match". If the columns combine to form a key, then this is expected behavior. However, if there happens to be two columns with the same name, but different usage that is not part of the key, you can end up with an empty set.

Chindraba
  • 820
  • 1
  • 18
  • 19
  • `NATURAL JOIN` is the answer, thank you! I may not implement it (because as you and others have warned it can cause unexpected behavior since it no longer forces implicit requests), but since this was a theory question anyway you definitely provided the answer. Thanks for teaching me something! – Bing Apr 24 '17 at 16:24
  • It was a good question, and one I'm sure many people fall for. It _is_ a natural way for our brains to process the data, and when it _doesn't_ work we are left scratching our heads. Questions like this are what makes SO a good place for future seekers of knowledge. – Chindraba Apr 24 '17 at 17:20
0

Because the WHERE you are using, isn't the filter only for the table you put after FROM. It will filter the result of those tables.

The type of thing you want to do is vague even to the human. The query can be more complicated than what you used as example. and with your idea it can be harder to read for debugging.

Also selecting by tables after WHERE will give you more flexibility. for example you can use WHERE employees.id=people.id.

and how about using these type of query inside something like PHP? you are using SELECT * FROM and what do you expect to get by $row['id'] inside PHP?

This type of selecting is good for more clear queries and more flexibility and it solved ambiguity for both machine and human.

ICE
  • 1,667
  • 2
  • 21
  • 43
  • What do you mean with this affirmative: `Also selecting by tables after WHERE will give you more flexibility`? – Jorge Campos Mar 10 '17 at 03:40
  • @JorgeCampos Maybe I used wrong words for what I had in my mind. I mean 1. if you want to filter it by `employees.id=people.id` at last, it doesn't make you to use it just in `ON`. for example `... JOIN T2 ON T1.barcode=T2.barcode and T1.store=T2.store LEFT JOIN ...` can be `... JOIN T2 ON T1.barcode=T2.barcode LEFT JOIN ... where T1.store=T2.store`. 2. sometimes we join multiple tables with same column name but there aren't related with those columns. for example three table joined and we can use `... WHERE T1.caption="something" and T3.caption="something_else"` – ICE Mar 10 '17 at 10:34
0

The simplest answer is that only providing a field reference when MySQL expects a table.field in a WHERE clause when more than one table is being queried is like trying to log into a system without providing your username, only your password. Sorry, your going to have to tell the system who's password you have supplied when you want to log in. Key + Datum = result. Null + Datum = Error.

Taking it a little deeper, MySQL does not make the connection between the two tables as defined in the JOIN statement UNTIL THE QUERY FINISHES. The WHERE clause is processed in sequence to make sure that the tables are filtered as they are joined. If you want to have it for search both tables for an particular ID#, then you need add the syntax for that. Something like:

`SELECT * FROM employees FULL JOIN people ON employees.id=people.id WHERE (employee.id = 2 OR people.id = 2)`

This is different in that a FULL JOIN will still work when one or the other table doesn't have a match in the common table, it still adds to the result with the empty columns of the missing entry being NULLs.

There is a lot out there on JOINs of the various types including this SO post and it was a dupe of an even older SO post that explains this in better detail.

TL:DR - If your working with one table, MySQL knows where to look. When your working with more than one table, MySQL gets confused easy and needs to know for sure what table your working on. Thats why it throws the error at you. The fix is simple, you need only add two things to the query in the, WHERE clause, something like:

'SELECT * FROM employees LEFT JOIN people ON employees.id=people.id WHERE employee.id = 2'

Adding the LEFT to JOIN gives the server extra instructions for the way your combining the assets and adding the table name with the column name will tell MySQL which table it needs to look at.

Community
  • 1
  • 1
Sandor Dosa
  • 133
  • 1
  • 9