I've broken my head against this issue trying to reproduce it on a dynamically generated data set, but it finally worked out!
This is the code
;with tbl as
(
select Id, ClCode, Manager, ChangeDate
from (values
(1, '000005', 'Cierra Vega', '2017-10-05'),
(2, '000005', 'Alden Cantrell', '2017-11-29'),
(3, '000005', 'Alden Cantrell', '2017-11-30'),
(4, '000005', 'Kierra Gentry', '2018-09-05'),
(5, '000005', 'Kierra Gentry', '2018-09-12'),
(6, '000005', 'Pierre Cox', '2018-11-06'),
(7, '000005', 'Thomas Crane', '2019-09-11'),
(8, '000005', 'Thomas Crane', '2019-10-01'),
(9, '000005', 'Miranda Shaffer', '2020-04-27'),
(10,'000360', 'Bradyn Kramer', '2017-10-06')
) as t(Id, ClCode, Manager, ChangeDate)
)
, grouped as
(
select c.ClCode
, count(distinct c.Manager) [ManagerChangesCount]
, STRING_AGG(c.[Manager], ',') within group (order by c.MinChangeDate) [Managers]
, STRING_AGG(c.MinChangeDate, ',') within group (order by c.MinChangeDate) [ChangeDates]
from (
select x.ClCode
, x.[Manager]
, min(x.ChangeDate) [MinChangeDate]
from tbl x
group by x.ClCode, x.[Manager]
) c
group by c.ClCode
)
select *
from grouped
My sample dataset contains data about when (ChangeDate) certain clients (ClCode) had their manager (Manager) changed. It is a part of the real DWH Clients dimension table (SCD Type 2), so these "duplicates" just contain changes in the other columns somewhere.
What I'm trying to achieve: I need a list of client codes with how many times their manager was changed and a comma separated list of these manager names sorted from left to right in their change order
ClCode ManagerChangesCount Managers ChangeDates
000005 6 Cierra Vega,Alden Cantrell,Kierra Gentry,Pierre Cox,Thomas Crane,Miranda Shaffer 2017-10-05,2017-11-29,2018-09-05,2018-11-06,2019-09-11,2020-04-27
000360 1 Bradyn Kramer 2017-10-06
But in fact I receive results without or with some weird sorting
ClCode ManagerChangesCount Managers ChangeDates
000005 6 Alden Cantrell,Cierra Vega,Kierra Gentry,Miranda Shaffer,Pierre Cox,Thomas Crane 2017-11-29,2017-10-05,2018-09-05,2020-04-27,2018-11-06,2019-09-11
000360 1 Bradyn Kramer 2017-10-06
This query returns good sorting:
- if I remove
, count(distinct c.Manager) [ManagerChangesCount]
- if I remove string id=10 (second client)
- if I add a filtering condition where
where ClCode = '000005'
- if I get rid of duplicate rows in a dataset and remove group by from a subquery
But all the same for me it looks like a bug... I mean unsorted results from my query
Guys, if you can figure this out, please help me to understand why sorting is not working