1

Say I have a table with the following data:

example data

You can see columns a, b, & c have a lot of redundancies. I would like those redundancies removed while preserving the site_id info. If I exclude the site_id column from the query, I can get part of the way there by doing SELECT DISTINCT a, b, c from my_table.

What would be ideal is a SQL query that could turn the site IDs relevant to a permutation of a/b/c into a delimited list, and output something like the following:

desired output

Is it possible to do that with a SQL query? Or will I have to export everything and use a different tool to remove the redundancies?

The data is in a SQL Server DB, though I'd also be curious how to do the same thing with postgres, if the process is different.

Troy
  • 21,172
  • 20
  • 74
  • 103
  • For MSSQL, you have [COALESCE](http://stackoverflow.com/a/194887/342740) and for PostgreSQL you have [string_agg](http://stackoverflow.com/a/43944/342740) but depending on version different methods. Don't forget search is your friend ;) – Prix Jan 14 '16 at 21:03
  • i know an easy way to do this is SQL but not sure about postgres. http://sqlandme.com/2011/04/27/tsql-concatenate-rows-using-for-xml-path/ – JamieD77 Jan 14 '16 at 21:06
  • here is one way to do it with SQL Server http://stackoverflow.com/a/1785923/215752 – Hogan Jan 14 '16 at 21:08
  • Wow, I thought what I was trying to do was kind of odd (and wasn't having much luck searching)... Apparently it's not all that unusual. I was just searching for the wrong stuff. Thanks guys! – Troy Jan 14 '16 at 21:11

3 Answers3

1

For SQL Server, you can use the FOR XML trick as found in the accepted answer in this post.

For your scenario it would look something like this:

SELECT a, b, c,  SiteIds = 
    STUFF((SELECT ', ' + SiteId
           FROM your_table t2 
           WHERE t2.a = t1.a AND t2.b = t1.b AND t2.c = t1.c
          FOR XML PATH('')), 1, 2, '')
FROM your_table t1
GROUP BY a, b, c
Community
  • 1
  • 1
db_brad
  • 903
  • 6
  • 22
1

For Postgres:

select a,b,c, string_agg(site_id::varchar, ',')
from my_table
group by a,b,b;

I assume site_id is a number, and as string_agg() only accepts character value, this needs to be casted to a character string for the aggregation. This is what site_id::text does. Alternatively you can use the cast() operator: string_agg(cast(site_id as varchar), ',')

0

This is generally known as String Aggregation. Many RDBMS's have the ability baked in, and many others don't.

In Postgres you just use the STRING_AGG(<field>, <delimiter>) function, and make sure to add a GROUP BY for your non-aggregated fields. Simple stuff.

In SQL Server.. not so pretty, but folks have functions and whatnot that will allow you to do this (like in this Q/A)

Community
  • 1
  • 1
JNevill
  • 46,980
  • 4
  • 38
  • 63