0

Okay, so I know from this question why I cannot reference an aliased column from WHERE, GROUP BY, or HAVING statements.

My problem is that I have this query which is being moved from a Teradata database to SQL Server 2012.

In Teradata, referencing an aliased column in the where, group by, having, and even join statements is perfectly valid.

My question is, how can I perform this query, and others like it, in SQL Server, without having to resort to populating a temporary table to select from first. (This example is just a single part of a large tSQL script containing 10 separate transactions, many of which or more complex than the example provided)

SELECT 
    MAX(CASE WHEN 
            Field_Name = 'Parent Brand Cd' AND 
            DATALENGTH(Field_Val)>1 
        THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
        ELSE NULL 
        END
        ) AS Parent_Brand_Cd,
    MAX(CASE WHEN 
            Field_Name = 'Brand Id' AND 
            DATALENGTH(Field_Val)>1 
        THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
        ELSE NULL 
        END
        ) AS Hotel_Cd,
    @src_sys_id AS Src_Sys_Id,
    MAX(CASE WHEN 
            Field_Name = 'Brand' AND 
            DATALENGTH(Field_Val)>1 
        THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
        ELSE NULL 
        END
        ) AS Temp1,
    CASE Temp1 
        WHEN 'Company1' 
            THEN 'c1'
        WHEN 'Company2' 
            THEN 'c2' 
        WHEN 'Company3' 
            THEN 'c3' 
        ELSE TEmp1 
    END AS Brand_Cd,
    @process_id AS Insert_Process_Id
 FROM dbo.Company -- STAGING
 GROUP BY Parent_Brand_Cd 
 HAVING 
    Parent_Brand_Cd IS NOT NULL AND 
    Hotel_Cd IS NOT NULL

The first issue with this query is that it is creating an aliased column Temp1 and then immediately attempting to perform a CASE statement against it. I can correct this by doing this:

MAX(CASE WHEN 
            Field_Name = 'Brand' AND 
            DATALENGTH(Field_Val)>1 
        THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
        ELSE NULL 
        END
        ) AS Temp1,
    MAX(CASE WHEN 
            Field_Name = 'Brand' AND 
            DATALENGTH(Field_Val)>1 
        THEN 
            CASE 
                WHEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) = 'Company1' THEN 'c1'
                WHEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) = 'Company2' THEN 'c2'
                WHEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) = 'Company' THEN 'c3'
                ELSE SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1)
            END
        ELSE NULL 
        END
        ) AS Brand_Cd,

But this will not help on other parts of the script where the aliased columns are calculation that are then used in other calculations. Also, it does not solve the issue of the aliased columns in the Group By or Having statements.

Is there any way around the limitations of Aliased Columns in SQL Server, without having to create and populate temporary tables all over the place?

EDIT: Working Solution as Proposed by GarethD

SELECT  
            Parent_Brand_Cd,
            Hotel_Cd,
            @src_sys_id AS Src_Sys_Id,
            CASE Temp1 
                WHEN 'Company1' THEN 'C1'
                WHEN 'Company2' THEN 'C2' 
                WHEN 'Company3' THEN 'C3' 
                ELSE TEmp1 
            END AS Brand_Cd,
            @process_id AS Insert_Process_Id,
            @process_id AS Update_Process_Id
        FROM    
            (   
                SELECT  
                    MAX(CASE WHEN Field_Name = 'Parent Brand Cd' THEN FieldValue END) AS Parent_Brand_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand Id' THEN FieldValue END) AS Hotel_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand' THEN FieldValue END) AS Temp1
                FROM    
                    (
                        SELECT  
                            Field_Name,
                            FieldValue = CASE WHEN DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END
                        FROM    DEV_STG_TB.dbo.Company_Attributes_3 -- STAGING
                    ) AS c

            ) AS sub
        WHERE   
            Parent_Brand_Cd IS NOT NULL AND 
            Hotel_Cd IS NOT NULL
        GROUP BY 
            Parent_Brand_Cd,
            Hotel_Cd,
            Temp1
Community
  • 1
  • 1
Jeff Beese
  • 388
  • 2
  • 5
  • 19
  • The first query doesn't makes sense to me. You are calculating a `MAX` and giving it the `Parent_Brand_Cd` alias, but then you are grouping by that column?, why?, this isn't valid in SQL Server – Lamak Apr 01 '15 at 16:36
  • I know this isn't valid in SQL Server, unfortunately it is valid in Teradata, which I am trying to move away from. The issue is that the original column name is `Field_Name`, which we have to parse out to get the actual column name as it exists in the database. – Jeff Beese Apr 01 '15 at 16:40
  • 1
    This is beyond the issue of using the alias, you are trying to calculate an aggregation function (`MAX`) by grouping by the same calculation. It doesn't make sense – Lamak Apr 01 '15 at 16:42
  • The company table that I am pulling from only has two columns, `Field_name` and `Field_Val` When the `Field_Name` is "Parent Brand Cd" then is Assigns the `Field_Val` for that row as the Parent Brand Column. This then gets inserted into a table that has the column `Parent_Brand_Cd` – Jeff Beese Apr 01 '15 at 16:50

1 Answers1

4

You can move your query into a subquery, and the reference your aliases. SQL Server is smart enough to be able to optimise this in the same way as the having clause (in the tests I have done at least). Consider the following two queries:

SELECT  Name, [Count]
FROM    (   SELECT  name, [Count] = COUNT(*)
            FROM    sys.Columns
            GROUP BY name
        ) AS sub
WHERE   [Count] > 1;

SELECT  name, [Count] = COUNT(*)
FROM    sys.Columns
GROUP BY name
HAVING COUNT(*) > 1;

The execution plan for both queries is exactly the same:

enter image description here

So your query would end up as something like:

SELECT  Parent_Brand_Cd,
        Hotel_Cd,
        @src_sys_id AS Src_Sys_Id,
        Temp1
        CASE Temp1 
            WHEN 'Company1' THEN 'c1'
            WHEN 'Company2' THEN 'c2' 
            WHEN 'Company3' THEN 'c3' 
            ELSE TEmp1 
        END AS Brand_Cd,
        @process_id AS Insert_Process_Id
FROM    (   SELECT  MAX(CASE WHEN Field_Name = 'Parent Brand Cd' AND DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END) AS Parent_Brand_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand Id' AND DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END) AS Hotel_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand' AND DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END) AS Temp1
            FROM    dbo.Company -- STAGING
            GROUP BY Parent_Brand_Cd 
        ) AS sub
WHERE   Parent_Brand_Cd IS NOT NULL 
AND     Hotel_Cd IS NOT NULL;

You could even reduce the repeated expressions further by using another subquery:

SELECT  Parent_Brand_Cd,
        Hotel_Cd,
        @src_sys_id AS Src_Sys_Id,
        Temp1
        CASE Temp1 
            WHEN 'Company1' THEN 'c1'
            WHEN 'Company2' THEN 'c2' 
            WHEN 'Company3' THEN 'c3' 
            ELSE TEmp1 
        END AS Brand_Cd,
        @process_id AS Insert_Process_Id
FROM    (   SELECT  MAX(CASE WHEN Field_Name = 'Parent Brand Cd' THEN FieldValue END) AS Parent_Brand_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand Id' THEN FieldValue END) AS Hotel_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand' THEN FieldValue END) AS Temp1
            FROM    (   SELECT  Parent_Brand_Cd,
                                Field_Name,
                                FieldValue = CASE WHEN DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END
                        FROM    dbo.Company -- STAGING
                    ) AS c
            GROUP BY Parent_Brand_Cd 
        ) AS sub
WHERE   Parent_Brand_Cd IS NOT NULL 
AND     Hotel_Cd IS NOT NULL;

Note, I have removed ELSE NULL from the case expressions as this is redundant.

I am a big fan of using Common Table Expressions instead of subqueries to de-clutter my queries (this is entirely subjective), and also using PIVOT, so I would personally rewrite the above as:

WITH CompanyCTE AS
(   SELECT  Parent_Brand_Cd,
            Field_Name,
            FieldValue = CASE WHEN DATALENGTH(Field_Val) > 1 
                            THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
                        END
    FROM    dbo.Company 
)
SELECT  pvt.Parent_Brand_Cd
        Parent_Brand_Cd = pvt.[Parent Brand Cd],
        Hotel_Cd = pvt.[Brand Id],
        Temp1 = pvt.[Brand]
FROM    CompanyCTE AS c
        PIVOT
        (   MAX(FieldValue)
            FOR Field_Name IN ([Parent Brand Cd], [Brand Id], [Brand])
        ) AS pvt
WHERE   pvt.[Parent Brand Cd] IS NOT NULL
AND     pvt.[Brand Id] IS NOT NULL;

The other advantage is PIVOT gives you direct access to the aggregated columns.

Of course, your other option is to just repeat the aggregate function:

HAVING MAX(CASE WHEN 
                Field_Name = 'Parent Brand Cd' AND 
                DATALENGTH(Field_Val)>1 
            THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
            ELSE NULL 
            END
            ) IS NOT NULL
AND    MAX(CASE WHEN 
                Field_Name = 'Parent Brand Cd' AND 
                DATALENGTH(Field_Val)>1 
            THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
            ELSE NULL 
            END
            ) IS NOT NULL;

ADDENDUM

Having seen the working solution it appears you don't need any grouping at all, so I think the following will work for you:

SELECT  Parent_Brand_Cd,
        Hotel_Cd,
        @src_sys_id AS Src_Sys_Id,
        Temp1
        CASE Temp1 
            WHEN 'Company1' THEN 'c1'
            WHEN 'Company2' THEN 'c2' 
            WHEN 'Company3' THEN 'c3' 
            ELSE TEmp1 
        END AS Brand_Cd,
        @process_id AS Insert_Process_Id
FROM    (   SELECT  MAX(CASE WHEN Field_Name = 'Parent Brand Cd' THEN FieldValue END) AS Parent_Brand_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand Id' THEN FieldValue END) AS Hotel_Cd,
                    MAX(CASE WHEN Field_Name = 'Brand' THEN FieldValue END) AS Temp1
            FROM    (   SELECT  Field_Name,
                                FieldValue = CASE WHEN DATALENGTH(Field_Val) > 1 THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) END
                        FROM    dbo.Company -- STAGING
                    ) AS c
        ) AS sub
WHERE   Parent_Brand_Cd IS NOT NULL 
AND     Hotel_Cd IS NOT NULL;

The reason being since parent_brand_cd is by definition unique, since it is derived from an aggregate with no grouping, any further grouping, although inconsequential, is redundant.

Or the PIVOT solution.

WITH CompanyCTE AS
(   SELECT  Field_Name,
            FieldValue = CASE WHEN DATALENGTH(Field_Val) > 1 
                            THEN SUBSTRING(Field_Val, 1, DATALENGTH(Field_Val) - 1) 
                        END
    FROM    dbo.Company 
)
SELECT  Parent_Brand_Cd = pvt.[Parent Brand Cd],
        Hotel_Cd = pvt.[Brand Id],
        Src_Sys_Id = @src_sys_id,
        Temp1 = pvt.[Brand],
        Brand_Cd = CASE pvt.[Brand] 
                        WHEN 'Company1' THEN 'c1'
                        WHEN 'Company2' THEN 'c2' 
                        WHEN 'Company3' THEN 'c3' 
                        ELSE TEmp1 
                    END,
        Insert_Process_Id = @process_id 
FROM    CompanyCTE AS c
        PIVOT
        (   MAX(FieldValue)
            FOR Field_Name IN ([Parent Brand Cd], [Brand Id], [Brand])
        ) AS pvt
WHERE   pvt.[Parent Brand Cd] IS NOT NULL
AND     pvt.[Brand Id] IS NOT NULL;
GarethD
  • 68,045
  • 10
  • 83
  • 123
  • Thanks @GarethD, I will give this a try and let you know how it works. – Jeff Beese Apr 01 '15 at 17:00
  • This won't work, the `c` derived table has only 2 columns, `Field_Name` and `FieldValue`, but then you are trying to `GROUP BY Parent_Brand_Cd`. Ah, the `PIVOT` version does has that column though – Lamak Apr 01 '15 at 17:07
  • Thanks GarethD, I had to move the `GROUP BY` to the main part of the query and in `Hotel_Cd` and `Temp1`, but otherwise, this is exactly what I was looking for. Marking as answer. – Jeff Beese Apr 01 '15 at 17:10
  • @JeffBeese I have actually mis-understood your question slightly as I missed that `Parent_Brand_Cd` is the grouping column and an alias, I thought it was just another column. Which leaves me intrigued as to how my answer was even close to right... For no other reason than to satisfy my own curiosity could you post the solution that worked for you as an addendum to your question? – GarethD Apr 01 '15 at 17:17
  • @JeffBeese thanks, I get it now. I have added to my answer as I think the final group by in your query might be unnecessary. – GarethD Apr 01 '15 at 18:18