1

I was asking this question about writing fast inline table valued function in sql server.

The code in the answer is working but I'm asking about that part :

enter image description here

It is clear to me that he wanted to create many numbers ( 1,1,1,1,1,...) and then turn them to sequential numbers (1,2,3,4,5,6....):

In this part :

WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2(N) AS (SELECT 1 FROM E1 a, E1 b)
,E4(N) AS (SELECT 1 FROM E2 a, E2 b)
SELECT * FROM e4 --10000 rows

He created 10000 rows.

This function is widely used and hence my question:

Question :

Why didn't he (Jeff Moden) use :

WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2(N) AS (SELECT 1 FROM E1 a, E1 b , E1 c , E1 d)

SELECT * FROM E2 -- ALSO 10000 rows !!!

But choose to split it into E2 , E4 ?

Community
  • 1
  • 1
Royi Namir
  • 144,742
  • 138
  • 468
  • 792
  • This is an exceptionally narrow question since the only person who could possibly answer it is Jeff Moden. – Damien_The_Unbeliever May 05 '15 at 10:43
  • @Damien_The_Unbeliever ,Still , I'm sure others can explain it since it all boils down to sql programming. :-) – Royi Namir May 05 '15 at 10:44
  • 1
    Yes, it is SQL programming - but sometimes we do things based on personal tastes/patterns that we follow rather than there being a *specific* reason why we write something in a particular way. I suspect that's the case here (especially given that both of your queries produce identical plans) – Damien_The_Unbeliever May 05 '15 at 10:51
  • @Damien_The_Unbeliever You're right. I'll wait a bit and if there is no special answer , i'll delete it. ( I also checked the execution plan , it is identical). – Royi Namir May 05 '15 at 10:52

2 Answers2

3

Although I am not Jeff Moden and do not know his reasoning, I find it likely that he simply used a known pattern for number generation which he himself calls Itzik Ben Gan's cross joined CTE method in this Stack Overflow answer.

The pattern goes like this:

WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
     E02(N) AS (SELECT 1 FROM E00 a, E00 b),
     E04(N) AS (SELECT 1 FROM E02 a, E02 b),
     E08(N) AS (SELECT 1 FROM E04 a, E04 b),
     ...

In order to adapt the method for his string splitting function, he apparently found it more convenient to modify the initial CTE to be ten rows instead of two and to cut down the number of cross joining CTEs to two to just cover the 8000 rows necessary for his solution.

Community
  • 1
  • 1
Andriy M
  • 76,112
  • 17
  • 94
  • 154
  • 1
    I would rename the `E00` to `E01` - for consistence. `Ex` will have `2^x` rows. – ypercubeᵀᴹ May 05 '15 at 12:48
  • @ypercube,.. the "E" notation is meant to designate the Exponent being used (although it's usually used for BASE 10 scientific notation). The "Units" place of just about any numbering system (I don't know what the exceptions are) has the Exponent of "0" and that's why E0 was used in this case. – Jeff Moden Nov 28 '16 at 04:15
2

Heh... just ran across this and thought I'd answer.

Andriy M answered it exactly right. It was very much modeled after Itzik Ben-Gan's great original BASE 2 code and, yes, I changed it (as have many others) to Base 10 code just to cut down on the number of cCTEs (Cascading CTEs). The latest code that I and many others use cuts down on the number of cCTEs even further. It also uses the VALUES operator to cut down on the bulk of the code, although there's no performance advantage in doing so.

   WITH  E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N)) --10 rows
        ,E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d)
 SELECT * FROM e4 --10000 rows
;

There are a great many other places where the need for such an on-the-fly creation of a sequence is required. Some need to start the sequence at 0 and others at 1. There's also a much larger range of values needed and, to be honest, I got tired of meticulously writing out code similar to the above so I did what Mr. Ben-Gan and many others have done. I wrote an iTVF called "fnTally". I don't normally use Hungarian Notation for functions but I had two reasons for using the "fn" prefix. 1) is because I still maintain a physical Tally Table and so the function needed to be named differently and 2) I can tell people at work "If you had used the 'eff-n' Tally function I told you about, you wouldn't have this problem" without it actually being an HR violation. ;-)

Just in case anyone should need such a thing, here's the code I wrote for my version of an fnTally function. There's a tiny bit of trade off in allowing it to start at 0 or 1 performance wise but it's worth the extra flexibility, to me anyways. And, yes... you could reduce the number of cCTEs in it by doing 12 CROSS JOINs in the 2nd and final cCTE. I just didn't go that route. You could without harm.

Also note that I still use the SELECT/UNION ALL method to form the first 10 pseudo-rows because I still do a lot of work with folks on 2005 and was stuck using 2005 myself until about 6 months ago. Full documentation is included in the code.

 CREATE FUNCTION [dbo].[fnTally]
/**********************************************************************************************************************
 Purpose:
 Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Trillion.

 As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.

 Usage:
--===== Syntax example (Returns BIGINT)
 SELECT t.N
   FROM dbo.fnTally(@ZeroOrOne,@MaxN) t
;

 Notes:
 1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.
    Refer to the following URLs for how it works and introduction for how it replaces certain loops. 
    http://www.sqlservercentral.com/articles/T-SQL/62867/
    http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers
 2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertable to the BIT data-type
    will cause the sequence to start at 1.
 3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.
 5. If @MaxN is negative or NULL, a "TOP" error will be returned.
 6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 1 Billion. If a larger
    number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with
    that many values, you should consider using a different tool. ;-)
 7. There will be a substantial reduction in performance if "N" is sorted in descending order.  If a descending 
    sort is required, use code similar to the following. Performance will decrease by about 27% but it's still
    very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.
    If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

    DECLARE @MaxN BIGINT; 
     SELECT @MaxN = 1000;
     SELECT DescendingN = @MaxN-N+1 
       FROM dbo.fnTally(1,@MaxN);

 8. There is no performance penalty for sorting "N" in ascending order because the output is explicity sorted by
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

 Revision History:
 Rev 00 - Unknown     - Jeff Moden 
        - Initial creation with error handling for @MaxN.
 Rev 01 - 09 Feb 2013 - Jeff Moden 
        - Modified to start at 0 or 1.
 Rev 02 - 16 May 2013 - Jeff Moden 
        - Removed error handling for @MaxN because of exceptional cases.
 Rev 03 - 22 Apr 2015 - Jeff Moden
        - Modify to handle 1 Trillion rows for experimental purposes.
**********************************************************************************************************************/
        (@ZeroOrOne BIT, @MaxN BIGINT)
RETURNS TABLE WITH SCHEMABINDING AS 
 RETURN WITH
  E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
            SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
            SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
            SELECT 1)                                  --10E1 or 10 rows
, E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d)      --10E4 or 10 Thousand rows
,E12(N) AS (SELECT 1 FROM E4 a, E4 b, E4 c)            --10E12 or 1 Trillion rows                 
            SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.
             UNION ALL 
            SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E12 -- Values from 1 to @MaxN
;
Jeff Moden
  • 3,271
  • 2
  • 27
  • 23