0

I have a table like this:

218 4   AudioVerse     https://www.audioverse.org/english/podcasts/latest   Latest      NULL    2012-03-29 15:32:44.287
222 7   TPB            http://rss.thepiratebay.se/0             Alt     NULL    2012-03-31 17:55:49.223
223 7   EZTV           http://www.ezrss.it/feed/                Alt     NULL    2012-03-31 17:56:41.573
226 11  The Piratebay  http://rss.thepiratebay.se/100               Audio Only  NULL    2012-04-04 14:57:45.377
227 11  The Piratebay  http://rss.thepiratebay.se/200               Video Only  NULL    2012-04-04 14:58:04.650
229 15  ThePirateBay   http://rss.thepiratebay.se/200                       NULL    2012-04-06 22:40:12.730
230 14  The Pirate Bay http://rss.thepiratebay.se/0                     NULL    2012-04-08 00:59:13.217
232 14  AudioVerse     https://www.audioverse.org/english/podcasts/latest           NULL    2012-04-08 01:03:22.787
233 14  EZTV           http://www.ezrss.it/feed/                        NULL    2012-04-08 01:20:55.860
234 17  Twit           http://twit.tv/node/feed                     NULL    2012-04-13 18:59:23.037
235 17  Diggnation     http://revision3.com/diggnation/feed/MP4-Large       Video Large NULL    2012-04-13 19:01:52.817

I want a query or stored procedure to return a table that returns distinct, AND a column with which id's that has this url in common, (The second id column (4,7,11,15,14,17) is the id's in question) in comma-separated form like this:

http://rss.thepiratebay.se/0    3   4,5,7,7,11,11,15,14,14,14,17,17
http://rss.thepiratebay.se/200  2   4,5,7,7,11,11,15,14,14,14,17,17
http://www.ezrss.it/feed/   2   4,5,7,7,11,11,15,14,14,14,17,17
https://www.audioverse.org/english/podcasts/latest  2   4,5,7,7,11,11,15,14,14,14,17,17
http://revision3.com/diggnation/feed/MP4-Large  1   4,5,7,7,11,11,15,14,14,14,17,17
http://twit.tv/node/feed    1   4,5,7,7,11,11,15,14,14,14,17,17
http://rss.thepiratebay.se/100  1   4,5,7,7,11,11,15,14,14,14,17,17

Here the first column is the distinct url, the next is how many times the url is present (differs from the main table above, but you get the point), and a comma-separated string of which id's that these urls has in common.

The comma-separated string in this case is not correct.

The queries I used to create this table is:

DECLARE @listStr VARCHAR(MAX)
SELECT @listStr = COALESCE(@listStr+',' ,'') + cast(userid as varchar)
FROM sites

select url, COUNT(*), (@listStr)
from sites
group by url
order by COUNT(*) desc

I am using SQL Server 2008 R2.

The question I present to you is: how to do this, and which way (if there is more than one) is most efficient?

Any help would be appreciated. I could do this in C# code, but I'd much rather have it in either a stored procedure, or a query, which ever is easiest, and/or faster :P

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Bjørn
  • 1,138
  • 2
  • 16
  • 47

3 Answers3

1

I use the coalesce method. I have found that the resulting concatenated string depends on whether the string was declared with a value prior to using COALESCE. The following script demonstrates this:

DECLARE @Mystring nvarchar( 500 ); --declare but do not set inital value

CREATE TABLE dbo.mytable( id int IDENTITY(1 , 1) PRIMARY KEY , name nvarchar( 40 ));

INSERT INTO mytable( name )
VALUES( 'Peter' ) , ( 'Paul' ) ,( 'Mary' );

SELECT @Mystring = COALESCE( @Mystring + ',' , '' ) + name
  FROM dbo.mytable
  ORDER BY id;

PRINT @Mystring; -- will print 'Peter,Paul,Mary' with no leading comma

SET @Mystring = ''; --now set to initial empty string
SELECT @Mystring = COALESCE( @Mystring + ',' , '' ) + name
  FROM dbo.mytable
  ORDER BY id;

PRINT @Mystring; --will print ',Peter,Paul,Mary' ie with leading comma

I don't know why tsql does this (I am using 2008 R2 Express) but it is worth being aware of. I now initialise string variables as empty strings and use the SUBSTRING() function to remove the leading delimiter to ensure consistent behaviour.

I am quite new to tsql so apologies if this is old hat stuff but I didn't find it covered elsewhere.

Jude Wood
  • 76
  • 6
0

Try this:

select url, count(*), group_concat(userid) 
from sites
group by url
order by COUNT(*) desc
barsju
  • 4,408
  • 1
  • 19
  • 24
  • 1
    What is group_concat? i am using microsoft's sql, not mysql. – Bjørn Apr 15 '12 at 10:04
  • @BjørnØyvindHalvorsen: Then, you should add a tag in your question. Either `[SQL-Server]` or `[MS-Access]` or whatever that "microsoft sql" means. – ypercubeᵀᴹ Apr 15 '12 at 10:05
  • Ah. To bad, cause `group_concat()` would give you precisely what you want.. http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html – barsju Apr 15 '12 at 10:07
  • Maybe this will help: http://stackoverflow.com/questions/451415/simulating-group-concat-mysql-function-in-ms-sql-server-2005 – barsju Apr 15 '12 at 10:08
  • @barsju: It was kind of evident though. MySQL has neither `VARCHAR(MAX)` nor uses `+` for concatenation. – ypercubeᵀᴹ Apr 15 '12 at 10:09
0

@Bjørn Øyvind Halvorsen

This question is relately OLD, but came across it and thought of shedding some light on it..

--- create table

create table test_kin (id int, location varchar(max), url varchar(max)) go

--- insert some values

insert into test_kin values (4 ,'AudioVerse' ,'https://www.audioverse.org/english/podcasts/latest')
insert into test_kin values (7 ,'TPB' ,'http://rss.thepiratebay.se/0')
insert into test_kin values (7 ,'EZTV' ,'http://www.ezrss.it/feed/')
insert into test_kin values (11 ,'The Piratebay' ,'http://rss.thepiratebay.se/100')
insert into test_kin values (11 ,'The Piratebay' ,'http://rss.thepiratebay.se/200')
insert into test_kin values (15 ,'ThePirateBay' ,'http://rss.thepiratebay.se/200')
insert into test_kin values (14 ,'The Pirate Bay' ,'http://rss.thepiratebay.se/0')
insert into test_kin values (14 ,'AudioVerse' ,'https://www.audioverse.org/english/podcasts/latest')
insert into test_kin values (14 ,'EZTV' ,'http://www.ezrss.it/feed/')
insert into test_kin values (17 ,'Twit' ,'http://twit.tv/node/feed')
insert into test_kin values (17 ,'Diggnation' ,'http://revision3.com/diggnation/feed/MP4-Large')

--- Query to produce results

select distinct url
,count(*) as ID
,stuff((
        select ',' + cast([id] as varchar(max))
        from test_kin
        for xml path('')
        ), 1, 1, '') as Result

from test_kin group by url order by ID

HTH, Kin

Kin Shah
  • 612
  • 6
  • 25
  • How does the query find the distinct url? I think we need to specify from which table, and when then using distinct we need a group by, and when using group by, we need to provide both url field, count, and the xml path thing... i might have missunderstood, but when i test this query i get: "Invalid column name 'url'" – Bjørn Jan 23 '13 at 08:47
  • Some how the code is not formatted correctly ... there is a from clause as below: ---- code start -------- select distinct url ,count(*) as ID ,stuff(( select ',' + cast([id] as varchar(max)) from test_kin for xml path('') ), 1, 1, '') as Result from test_kin group by url order by ID ---- code end -------- I have tested it on SQLServer 2008 R2 and it works fine. This will give you a start on what you are looking to achieve. – Kin Shah Jan 23 '13 at 14:26
  • Ah.. my bad.. sorry (if you add some spaces behind from, group and order they will be formatted correctly), and the query returns a result that is not correct. the same as the query i have placed in my question on the top. – Bjørn Jan 24 '13 at 11:36
  • Hmm.. Interesting ... Can you shed some more light on what your are trying to achive .. Probably something like -- create table , some insert statements , and expected results. Pretty Sure that using the stuff and for xml path .. that can be achieved. – Kin Shah Jan 24 '13 at 16:03