0

What if I have two tables with primary key joins, i.e. both columns have same name and same data, and, then I have a column name timestamp, which has same name for both tables but different data according to their tables. Could I still use NATURAL JOIN? or is there a workaround considering I need to use SELECT * from both tables without aliasing?

Ari7
  • 19
  • 7

2 Answers2

1

Don't use NATURAL JOIN! It is an abomination. It does not even take properly declared foreign key relationships into account.

You may find the USING clause useful:

select . . .
from a join
     b
     using (pk);

Another problem with NATURAL JOIN is that the join keys are not listed. This can make it really hard to debug code, if something goes wrong. You are experiencing the problem with the TIMESTAMP column.

If you do use USING you can use SELECT * and the USING keys appear only once in the SELECT. Sometimes, this can be a convenience.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Would it give a duplicate column error if I run it as a subquery? – Ari7 Apr 17 '18 at 22:52
  • @Ari7 . . . Although I advise listing the columns explicitly, using `SELECT *` in a subquery can be useful. This will not work in your case because you have the duplicate `TIMESTAMP` columns. However, the join key will not be duplicated. – Gordon Linoff Apr 17 '18 at 22:56
  • Thanks for the answer @Gordon Linoff. I would be needing to run something like this in a subquery, so is there a way to not encounter the duplicate column error, without aliasing. If there is some other way, it'd be great if you could direct me how to do that. – Ari7 Apr 17 '18 at 22:58
  • @Gordon Linoff "This will not work in your case because you have the duplicate `TIMESTAMP` columns." It's always possible to rename in a subquery `SELECT TIMESTAMP AS TIMESTAMP2 ...` then you won't get a name clash with `NATURAL JOIN`. But better to name your columns in the base tables carefully, to avoid those clashes. Unfortunately in practice column naming is not so carefully designed, because designers don't anticipate/are prejudiced against `NATURAL JOIN`. – AntC Apr 17 '18 at 23:44
  • 1
    @AntC . . . I choose the names of my columns and tables *very* carefully after decades of experience. Almost all my tables have columns called `createdAt`, `createdBy`, and `createdOn`, precluding the use of natural joins anyway. That is a separate issue from the fact that they do not use properly declared foreign key relationships and they make queries more unstable and error-prone than alternatives. – Gordon Linoff Apr 18 '18 at 02:25
  • So when your answer says "Don't use `NATURAL JOIN`!"; you mean: you've designed the schema/naming so that it's too hard to use; you _don't_ mean that `NATURAL JOIN` is an abomination. Yes SQL doesn't help at all with same-naming columns for a Foreign Key constraint. Sad! – AntC Apr 18 '18 at 11:13
  • @AntC . . . No, I mean it is an abomination because it does not take into account properly declared foreign key relationships. I'm pretty sure that is what I always say, when I make that statement. – Gordon Linoff Apr 18 '18 at 11:31
  • 1
    Queries don't take foreign key constraints into account *by design*. – onedaywhen Apr 26 '18 at 11:14
0

There are numerous fixes, tha can be used in combination:

• Ensure each data element name is unqiue within its schema. For example, rather than relying on updated_date attribute to gain meaning via its context of appearing in the Companies table, instead name it company_name_updated_date.

• Separate entity attributes and audit attributes into distinct base tables e.g. company_name in the Companies table and company_name_updated_date in the CompaniesAudit table. Putting them in distinct schemas might also make sense.

• Code defensively using views: this is probably something you should be doing anyhow. Consider what attrbutes should be available to a given set of users (use cases, applications, etc), create a set of views to expose these attributes and grant privileges as required. Don't expose audit columns to users who don't need to see them.

• Code defensively in queries ('on the fly'): use drived tables to project only the columns the query requires, therefore if a new column is added to an existing then it isn't a 'breaking change' for that query. Exmaple:

WITH c AS ( SELECT company_id, company_name FROM Companies ),
p AS ( SELECT employee_id, company_name FROM Payroll )
SELECT * FROM c NATURAL JOIN p;
onedaywhen
  • 55,269
  • 12
  • 100
  • 138