16

Someone can explain the code below to me? I am confused by Group by 1, 2, 3, 4, because there is no variable and value called 1, 2, 3, 4.

proc sql; create table SampleData as select
uniqID, Category, Grade, Value,

count(uniqID) as CNT,
avg(PRICE) as APR,
avg(Value) as ALE

from DataIn where date ge &intdt. group by 1,2,3,4;
quit;
ekad
  • 14,436
  • 26
  • 44
  • 46
user1481397
  • 393
  • 1
  • 9
  • 25
  • 3
    It's the column reference-column1, column2, so if you used select * , its first 4 columns. – Reeza Apr 14 '16 at 16:38

2 Answers2

19

It's a relative reference to the 1st, 2nd, 3rd, and 4th variables in your select statement. Ie. uniqID, Category, Grade, Value.

DanZ
  • 451
  • 2
  • 9
0

This is a really bad habit in my opinion and is not supported by most other database platforms.

It's faster to just write numbers instead of using column names. However, the use of this notation has two big problems:

  • It is not self-explanatory, you have to parse the SELECT columns list to understand the GROUP BY description.
  • It is more error prone for short / quick changes. Now if the SELECT list is changed, the GROUP BY list is applied to the new SELECT list without possibly having planned it. If column names were used in the GROUP BY, the order in the SELECT list would be irrelevant.

Some other opinons: https://sqlblog.org/2009/10/06/bad-habits-to-kick-order-by-ordinal