0

The script shown here work in SQL Server but NOT in SNOWFLAKE SQL. What is the equivalent in SNOWFLAKE SQL?

SELECT DISTINCT 
    ST2.SubjectID, 
    SUBSTRING((SELECT ',' + ST1.StudentName AS [text()]
               FROM dbo.Students ST1
               WHERE ST1.SubjectID = ST2.SubjectID
               ORDER BY ST1.SubjectID
               FOR XML PATH (''), TYPE).value('text()[1]', 'nvarchar(max)'), 2, 1000) [Students]
FROM 
    dbo.Students ST2

RESULTS FROM SAMPLE BELOW: IT CONCATENATES TEXT FROM ALL THE ROWS INTO A SINGLE TEXT STRING BY ID

I tried the above in SQL Server and it worked, however, I need to use a datawarehouse in Snowflake and snowflake doesn't use XML PATH. They have XMLGET but I can't figure out how to use it.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
liz
  • 1
  • 1
  • 2
    The equivalent of SQL Server [STRING_AGG](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-ver16) is [LISTAGG](https://docs.snowflake.com/en/sql-reference/functions/listagg.html). The XML is an old way to concatenate string on SQL Server before the String_Agg function was introduced, therefore when translating code between two DBs the focus should be on the intent/behavior and not 1:1 translation of code. – Lukasz Szozda Oct 29 '22 at 07:22

3 Answers3

2

You seem to want listagg. Implementation should look like this

select SubjectId, listagg(distinct StudentName,',') as Students
from your_table
group by SubjectId;
Radagast
  • 5,102
  • 3
  • 12
  • 27
1

As Lukasz mentions, the FOR XML PATH ('') syntax in SQL Server was a common way to implement string aggregation before the existence of an explicate## Heading ## operator in later SQL Server versions. This answer describes how it works in SQL Server.

If you are on a version of SQL Server that support the operator, then you could change your code to use STRING_AGG and test that it gives the correct results on SQL Server. Then to migrate to Snowflake, you can simply change the STRING_AGG keyword to LISTAGG.

If you have a lot of such SQL to convert, you might consider using tooling that will recognize such specialized syntax and convert it to the simpler form.

Paul Vernon
  • 3,818
  • 1
  • 10
  • 23
0

so if your source data look like:

    select * from values
    (1, 'student_a'),
    (1, 'student_b'),
    (1, 'student_c'),
    (2, 'student_z'),
    (2, 'student_a')

both the old code and Phil's code will have a random order, the original code did an ORDER BY SubjectID but that is the value being grouped.

In snowflake an order like this can be done with within group (order by studentname)

so Phil's answer becomes:

select 
    subjectid, 
    listagg(distinct studentname,',') within group (order by studentname) as students
from students
group by subjectid; 

which then gives the results:

SUBJECTID STUDENTS
2 student_a,student_z
1 student_a,student_b,student_c
Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45