7

This seems like a fairly obvious question, but I haven't been able to think of the proper term for what I am trying to ask, so coming up with reference material for this has been tricky. The answers seem obvious, though.

When examining the Pluralsight training material for SQL Server, they recommended always referring to tables in both "regular" queries (something you might write for a basic web service) and for a SQL query in perhaps a stored procedure, like the following:

[databasename].[dbo].[some_table].[sometimesacolumngoeshere]

Though I have found it very common to come across stored procs, etc., that simply use:

my_column    or    [my_column]

The difference here is obviously that one provides an absolute "address" explicitly, whereas the other is implicit.

How do you know when it is appropriate to use one over the other, and also what do you call this, addressing?

My preference would be to always be explicit, and if you need to save space and/or make things more clear, you could alias to an explicit full "address", correct?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Ray
  • 1,422
  • 2
  • 21
  • 39

7 Answers7

6

You are correct. Basically SQL will attempt to find the field you are looking for "my_column" in all of the tables in your FROM and JOIN sections. If however you happen to have a "my_column" in table A and in table B then you need to explicitly tell it which "my_column" you are looking for by including the table name. This goes on up the chain to dbo and databasename if you have collisions there as well.

Most of the time you will find that people don't explicitly call out the tables a column is in unless they are joining multiple tables.

For example I write my queries like this:

SELECT a.field1, b.field2
FROM tableA AS a
INNER JOIN tableB AS b ON a.id = b.a_id
WHERE a.id = 123

Here I am using the AS to alias tableA and tableB down to more readable a and b. I could have just as easily written my query like this:

SELECT tableA.field1, tableB.field2
FROM tableA
INNER JOIN tableB ON tableA.id = tableB.a_id
WHERE tableA.id = 123

Or like this, if field1 and field2 are unique to there tables, but this gets a bit confusing as to where each piece of data is coming from.

SELECT field1, field2
FROM tableA
INNER JOIN tableB ON tableA.id = tableB.a_id
WHERE tableA.id = 123
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Rob Booth
  • 1,792
  • 1
  • 11
  • 22
  • 1
    I can't disagree with that. Using table aliases definitely help reduce the clutter and still provide the necessary hint to tell which table a field is coming from by looking at the query alone. +1 to you. – David Apr 05 '12 at 17:15
2

SQL Server has four part names:

  • Most often you reference an object by name

    SELECT * FROM MyTable
    
  • Then you can specify the owner or schema of the object:

    SELECT * FROM dbo.MyTable
    
  • Then you can reference the database that the object lives in:

    SELECT * FROM master.dbo.MyTble
    
  • Finally you can reference the table on a different server

    SELECT * FROM test1.master.dbo.MyTable
    

It's explained better than I can on MSDN

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Phil
  • 42,255
  • 9
  • 100
  • 100
  • You should use at least the schema, there is a performance difference. – HLGEM Apr 05 '12 at 17:01
  • 1
    A small difference: http://stackoverflow.com/questions/1112374/sql-server-performance-and-fully-qualified-table-names – Phil Apr 05 '12 at 17:06
  • 1
    http://blogs.msdn.com/b/mssqlisv/archive/2007/03/23/upgrading-to-sql-server-2005-and-default-schema-setting.aspx – HLGEM Apr 05 '12 at 17:41
2

While it is possible to use implict names for columns it is a poor choice from a maintainabilty aspect. I never put production code out that doesn't alias every column because, when you go back a year later to change that report or query, you really don't want to have to figure out which of the 20 tables you joined to that it is in. Further, not specifying makes the database work harder to find the column and you have coding more errors where you have column names that are the same in two tables without a reference. It is a good habit to be in to use explict references.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
2

Pluralsight are wrong.

Given the example of a stored procedure that uses a fully qualified name that refers to objects in the same database, here are two more reasons to not fully qualify in that manner:

  • If you have a stored procedure that refers to an object in the same database, it will break if you rename your database.

  • If you were to start using Visual Studio database tools to add your database scripts to source control, you get lots of warnings to deal with

It is a good idea to understand and utilise schemas properly

Nick.Mc
  • 18,304
  • 6
  • 61
  • 91
1

I think this is one of those questions that's subjective, and will turn out to be a matter of preference, but:

From a readability perspective, I'd argue for being explicit ONLY when necessary. Quite often SQL statements are complex enough without having to read a lot of unnecessary text.

I think that

SELECT TOP 1000 [StoreNumber]
      ,[Address1]
      ,[Address2]
      ,[City]
      ,[St]
      ,[Zip]
      ,[ZipSuffix]
      ,[LocationType]
      ,[LocationSubType]
      ,[Corp]
      ,[Division]
      ,[ZoneNumber]
      ,[DistrictNumber]
      ,[StateNumber] 
  FROM [CommonData].[dbo].[vw_StoreData]

is a LOT more readable than

SELECT TOP 1000 [CommonData].[dbo].[vw_StoreData].[StoreNumber]
      ,[CommonData].[dbo].[vw_StoreData].[[Address1]
      ,[CommonData].[dbo].[vw_StoreData].[[Address2]
      ,[CommonData].[dbo].[vw_StoreData].[[City]
      ,[CommonData].[dbo].[vw_StoreData].[[St]
      ,[CommonData].[dbo].[vw_StoreData].[[Zip]
      ,[CommonData].[dbo].[vw_StoreData].[[ZipSuffix]
      ,[CommonData].[dbo].[vw_StoreData].[[LocationType]
      ,[CommonData].[dbo].[vw_StoreData].[[LocationSubType]
      ,[CommonData].[dbo].[vw_StoreData].[[Corp]
      ,[CommonData].[dbo].[vw_StoreData].[[Division]
      ,[CommonData].[dbo].[vw_StoreData].[[ZoneNumber]
      ,[CommonData].[dbo].[vw_StoreData].[[DistrictNumber]
      ,[CommonData].[dbo].[vw_StoreData].[[StateNumber] 
  FROM [CommonData].[dbo].[vw_StoreData]

(It gets worse when you start joining tables, and even worse if you're joining tables across different databases.)

I can see where you could argue that the second is more readable if you need to know exactly which database, schema, and table a particular field comes from by looking at the query alone.

But in SQL Server, for example, you'd be able to open that query in a designer and see it in a much more friendly graphical view.

IMHO, the only time that I would use the full syntax is when necessary, when crossing table/database/schema boundaries, or if you've got two tables with the same field name.

Example:

SELECT TOP 1000 [CommonData].[dbo].[vw_StoreData].[StoreNumber]
      ,[Address1]
      ,[Address2]
      ,[City]
      ,[St]
      ,[Zip]
      ,[ZipSuffix]
      ,[LocationType]
      ,[LocationSubType]
      ,[Corp]
      ,[Division]
      ,[ZoneNumber]
      ,[DistrictNumber]
      ,[StateNumber] 
  FROM [CommonData].[dbo].[vw_StoreData]
  Inner Join [CommonData].[dbo].[vw_StorePhones]
   ON [CommonData].[dbo].[vw_StorePhones].[StoreNumber] = [CommonData].[dbo].[vw_StoreData].[StoreNumber]

And even in that case, I'd use Table Aliases to shorten it up and make it more readable.

All of that said, in the real world, it's quite likely you'll find yourself working for a company that has already decided the standard format, and you'll need to code according to the company's standard.

David
  • 72,686
  • 18
  • 132
  • 173
1

Objects can be referred through a single or several identifiers.

You can use a single identifier to refer to a object, but if the identifier is ambiguous, you must use more identifiers to uniquely identify the object.

For example, two tables named TableA existing in different schemas must be referenced in the a query using at least a two identifiers, schema_one.TableA and schema_two.TableA.

There's a MDSN page where you can find out more about object names and identifiers.

Regarding the use of several identifiers for object names, if you are more specific, you reduce ambiguity in queries and speed up the parsing of the query, as the database engine doesn't have to resolve ambiguity problems, at the cost of readability of the queries.

My personal preference (for the most commonly used objects) is using schema.Table when referencing tables and column, if a single table is referenced in the query, and table.column if multiple tables are referenced in the query.

Community
  • 1
  • 1
gonsalu
  • 3,154
  • 2
  • 18
  • 19
0

In most cases i would always advise to use the full address just to be safe

    [databasename].[dbo].[some_table].[sometimesacolumngoeshere]

However, this is only really needed when you have multiple databases. I only ever encountered one or two issues selecting databases and this is usually fixed by selecting the correct one in sql server.

Once you are actually inside a query and you have given the table a shortened alias then there really is no need to include the full address because this is already referenced.

E.G

   FROM [databasename].[dbo].[some_table].[sometimesacolumngoeshere] SOME

   SELECT SOME.Name
Ryan Gadsdon
  • 2,272
  • 4
  • 31
  • 56