0

I encountered a lot of thread about this, the solutions suggested all tend to go the same way, but it is very inconvenient in my case.

Most of the time something like this is suggested.

DECLARE @Actors TABLE ( [Id] INT , [Name] VARCHAR(20) , [MovieId] INT);
DECLARE @Movie TABLE ( [Id] INT, [Name] VARCHAR(20), [FranchiseId] INT );


INSERT  INTO @Actors
    ( Id, Name, MovieId )
VALUES  ( 1, 'Sean Connery', 1 ),
    ( 2, 'Gert Fröbe', 1 ),
    ( 3, 'Honor Blackman', 1 ),
    ( 4, 'Daniel Craig', 2 ),
    ( 5, 'Judi Dench', 2 ),
    ( 2, 'Harrison Ford', 3 )

INSERT  INTO @Movie
    ( Id, Name, FranchiseId )
VALUES  ( 1, 'Goldfinger', 1 ),
    ( 2, 'Skyfall', 1 ),
    ( 3, 'Return of the Jedi', 2 )


SELECT  m.Name ,
    STUFF(( SELECT  ',' + a_c.Name
            FROM    @Actors a_c
            WHERE   a_c.MovieId = m.Id
          FOR
            XML PATH('')
          ), 1, 1, '')
FROM    @Actors a
    JOIN @Movie m ON a.MovieId = m.Id
GROUP BY m.Id ,
    m.Name

The Problem is (how shall I explain?), one does not really access the grouped Items (as Count(), Max(), Min(), ...), one does rebuild the joining pattern of the "outer query" and force in the WHERE statement, that the corresponding values are the same as those in the GROUP BY statement (in the outer query).

If you do not understand what I'm trying to say, I extended the Example above, by one additional table and you will see, that I will also have to extend the "Inner Query"

DECLARE @Actors TABLE ( [Id] INT , [Name] VARCHAR(20) , [MovieId] INT);
DECLARE @Movie TABLE ( [Id] INT, [Name] VARCHAR(20), [FranchiseId] INT );
DECLARE @Franchise TABLE ( [Id] INT , [Name] VARCHAR(20));


INSERT  INTO @Actors
    ( Id, Name, MovieId )
VALUES  ( 1, 'Sean Connery', 1 ),
    ( 2, 'Gert Fröbe', 1 ),
    ( 3, 'Honor Blackman', 1 ),
    ( 4, 'Daniel Craig', 2 ),
    ( 5, 'Judi Dench', 2 ),
    ( 2, 'Harrison Ford', 3 )

INSERT  INTO @Movie
    ( Id, Name, FranchiseId )
VALUES  ( 1, 'Goldfinger', 1 ),
    ( 2, 'Skyfall', 1 ),
    ( 3, 'Return of the Jedi', 2 )

INSERT  INTO @Franchise
    ( Id, Name )
VALUES  ( 1, 'James Bond' ),
    ( 2, 'Star Wars' )


SELECT  f.Name ,
    STUFF(( SELECT  ',' + a_c.Name
            FROM    @Actors a_c
                    JOIN @Movie m_c ON a_c.MovieId = m_c.Id
            WHERE   m_c.FranchiseId = f.Id
          FOR
            XML PATH('')
          ), 1, 1, '')
FROM    @Actors a
    JOIN @Movie m ON a.MovieId = m.Id
    JOIN @Franchise f ON m.FranchiseId = m.Id
GROUP BY f.Id ,
    f.Name

And now, going somewhat further, imagine a huge query, very complicated, several grouping values over many tables. Performance is an issue. I don't want to rebuild the whole joining pattern in the "inner query".

So is there any other way? A way that does not kill performance and you do not have to duplicate the joining pattern?

Luinil
  • 85
  • 7
  • I am sorry, but I don't understand your question. What exactly is not working correctly here? (For example, when you say, _"one does not [do this and that]"_, are you referring to the queries you're showing, or do you refer to what you would *like to* do (but don't know how)?) – stakx - no longer contributing Sep 17 '14 at 08:50
  • My example does work, yes. But it is a very simple grouping statement and I am having a grouping problem that is way huger and more complicted, 20 joins, 15 grouping keys .... My problem is, that on complicated queries, the solution given does not really satisfy. It come with bad performance and is fragile and error-prone, since you must do all the joining again. Furthermore maintance is complicated as well. – Luinil Sep 17 '14 at 08:55
  • Possible duplicate of [How to use GROUP BY to concatenate strings in SQL Server?](https://stackoverflow.com/questions/273238/how-to-use-group-by-to-concatenate-strings-in-sql-server) – IvanH Apr 03 '19 at 13:51

1 Answers1

0

Contrary to what I said in this comment, you need no GROUP BY clause, nor a WHERE clause, at all!

You simply need the outer SELECT to "iterate" over all franchises (or whatever you want to group by). Then in the inner SELECT, you need some JOINs to get to the franchise key column. Instead of a WHERE clause to filter by the outer franchise's key, simply use the outer franchise key directly in the INNER JOIN:

SELECT f.Name AS FranchiseName,
       COALESCE(STUFF((SELECT DISTINCT ', ' + a.Name
                       FROM @Actor a
                       JOIN @Movie m ON a.MovieId = m.Id
                       WHERE m.FranchiseId = f.Id
                       ORDER BY ', ' + a.Name -- this is optional
                       FOR XML PATH('')), 1, 1, ''), '') AS ActorNames
FROM @Franchise f

Source of information: "High Performance T-SQL Using Window Functions" by Itzik Ben-Gak. Because SQL Server unfortunately does not have an aggregate/window function for concatenating values, the book's author recommends something like the above as the next best solution.

P.S.: I've removed my previous solution that substituted an additional JOIN for a WHERE clause; I am now fairly certain that a WHERE clause is likely to perform better. Nevertheless, I left some evidence of my previous solution (i.e. the striked-through text) because of that reference to a comment I made earlier.

Community
  • 1
  • 1
stakx - no longer contributing
  • 83,039
  • 20
  • 168
  • 268
  • That looks very interesting and i'll try right away – Luinil Sep 17 '14 at 09:59
  • The performance is better than in all earlier attempts. its still not the nicest code, but at least performance is better and that is way more important than other things. Thank you very much! – Luinil Sep 17 '14 at 10:27
  • You are welcome! I'm glad to hear you're making progress. Feel free to upvote / accept any useful answers. ;) ;) – stakx - no longer contributing Sep 17 '14 at 10:40
  • My code is not nice, but at least performance is more or less oke. Unless there comes a whole new idea, i don't think it will get any better. Therefore I accept this answer. – Luinil Sep 17 '14 at 12:11