0

This is a follow-up question to this StackOverflow post. I'm trying to use two aliases at once, but it doesn't seem to work.

Some background: I'm using Microsoft SQL Server. I have a data set of Posts and Users from a StackExchange site. I have this SQL statement, which successfully lists the sum of points and the sum of views for each user for every post with a score greater than zero.

SELECT Users.Id, SUM(Posts.Score) AS ScoreSum, SUM(Posts.ViewCount) AS ViewSum
  FROM [tsql-dml].[dbo].[Posts] AS Posts
  INNER JOIN [tsql-dml].[dbo].[Users] AS Users
  ON Posts.OwnerUserId = Users.Id
  WHERE Posts.Score > 0
  GROUP BY Users.Id
  ORDER BY ViewSum DESC

Next, what I would like to do is get the ratio of the ViewSum column to the ScoreSum column. As noted in the linked post above, using the column alias directly doesn't work, but (SELECT Alias) works. It seems that including these parenthesis is necessary. I tried this (only showing the select part of the query, the rest is the same):

SELECT Users.Id, SUM(Posts.Score) AS ScoreSum, SUM(Posts.ViewCount) AS ViewSum,
(SELECT(ViewSum)) / (SELECT(ScoreSum)) AS ViewScoreRatio

This gives the following error in SQL:

Msg 207, Level 16, State 1, Line 2
Invalid column name 'ViewSum'.
Msg 207, Level 16, State 1, Line 2
Invalid column name 'ScoreSum'.

So I thought, maybe I have to put the parenthesis around the whole thing. I tried this:

SELECT Users.Id, SUM(Posts.Score) AS ScoreSum, SUM(Posts.ViewCount) AS ViewSum,
(SELECT(ViewSum) / SELECT(ScoreSum)) AS ViewScoreRatio

With this one, the left SELECT for ViewSum doesn't show an error, but the right one fails.

Msg 156, Level 15, State 1, Line 2
Incorrect syntax near the keyword 'SELECT'.
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near ')'

Then I thought, what if I need parenthesis both ways? So I tried this:

SELECT Users.Id, SUM(Posts.Score) AS ScoreSum, SUM(Posts.ViewCount) AS ViewSum,
((SELECT(ViewSum)) / (SELECT(ScoreSum))) AS ViewScoreRatio

This fails with the same message as the first try:

Msg 207, Level 16, State 1, Line 2
Invalid column name 'ViewSum'.
Msg 207, Level 16, State 1, Line 2
Invalid column name 'ScoreSum'.

How do I use two column Aliases in SQL calculations?

The only workaround I've found so far is to re-do the sum, but I fear this is inefficient:

SELECT Users.Id, SUM(Posts.Score) AS ScoreSum, SUM(Posts.ViewCount) AS ViewSum,
SUM(Posts.ViewCount) / SUM(Posts.Score) AS ViewScoreRatio
Aaron Franke
  • 3,268
  • 4
  • 31
  • 51

2 Answers2

3

You cannot use alias* in the same query where it's defined, but it's easy to put it in a subquery

SELECT Id,ScoreSum,ViewSum, ViewSum/ScoreSum AS ViewScoreRatio FROM 
   ( SELECT 
        Users.Id, 
        SUM(Posts.Score) AS ScoreSum, 
        SUM(Posts.ViewCount) AS ViewSum 
     FROM ... ) r1

*see 3rd and 4th comments that precise this point.

Dri372
  • 1,275
  • 3
  • 13
  • 1
    to re-do the sum, but I fear this is inefficient : Not necessary due to sql engine. – Dri372 Jun 16 '21 at 21:22
  • This works, but it is MUCH slower than just re-doing the sum. This takes about 10 seconds to execute, but re-doing the sum takes about half a second. – Aaron Franke Jun 16 '21 at 21:27
  • 1
    You can use aliases in the same query only in specific locations such as in the `ORDER BY` clause, `GROUP BY` and `HAVING` but different engines accept the in different places. – The Impaler Jun 16 '21 at 21:27
  • 2
    According to the ANSI/ISO SQL standard, the column alias can only be used in the ORDER BY clause. – jarlh Jun 16 '21 at 21:29
  • 1
    t is MUCH slower ... :-) see my first comment, it depends of the db you use (personnaly I like postgresql) and your indexes. – Dri372 Jun 16 '21 at 21:29
  • @jarth probably right but not easy to know/follow sql standard – Dri372 Jun 16 '21 at 21:49
  • That's why I tell you. – jarlh Jun 16 '21 at 22:50
  • 1
    @AaronFranke I can assure you there is no way this is any different than copy-pasting the `sum` calculation, it will be exactly the same. The only way this can be different is if there is a bad plan which you sometimes get due to bad statistics or parameter sniffing, but you would sometimes get that with the other query also. If you have different results than that please demonstrate – Charlieface Jun 16 '21 at 23:59
1

You actually came to the right solution:

SELECT Users.Id
, SUM(Posts.Score) AS ScoreSum
, SUM(Posts.ViewCount) AS ViewSum
, SUM(Posts.ViewCount) / SUM(Posts.Score) AS ViewScoreRatio

FROM [tsql-dml].[dbo].[Posts] AS Posts
  INNER JOIN [tsql-dml].[dbo].[Users] AS Users ON Posts.OwnerUserId = Users.Id

WHERE Posts.Score > 0

GROUP BY Users.Id

ORDER BY ViewSum DESC

That probably won't be slow. I do it all the time. But the table I usually do this against has only about 75 million rows and I'm taking advantage of database partitioning. Your mileage may vary.

Here's the problem with your early attempt:

In order to evaluate...

(SELECT(ViewSum) / SELECT(ScoreSum))

...SQL Server must be able to evaluate...

SELECT(ViewSum)

...and...

SELECT(ScoreSum)

But neither of these is valid. There is no context. SQL Server has no way to know where to get ViewSum or ScoreSum.

dougp
  • 2,810
  • 1
  • 8
  • 31