2

Consider a column named ArticleCategories in Article table. The goal is to delete repeated numbers from ArticleCategories column for each row (.

ArticleCategories
------------
3193;2867;3193;
2871;2923;2923;
3278;3337;3337;
2878;2876;2878;
3720;3680;3680;

Any help on how I can get this done with SQL Server?

desired result

ArticleCategories
------------
3193;2867;
2871;2923;
3278;3337;
2878;2876;
3720;3680;

Thanks in advance if anyone could help.

ruakh
  • 175,680
  • 26
  • 273
  • 307
Ray
  • 311
  • 2
  • 6
  • 15
  • possible duplicate of [T-SQL: Opposite to string concatenation - how to split string into multiple records](http://stackoverflow.com/questions/314824/t-sql-opposite-to-string-concatenation-how-to-split-string-into-multiple-reco) – Tassadaque Sep 10 '13 at 04:52
  • 1
    God, when are programmer going to **stop** concatenating multiple values into a single column?!?! You should stop wasting time on *cleaning" this up - you should instead re-architect it to use **proper, normalized table design** so that you don't need such a kludge in the first place – marc_s Sep 10 '13 at 05:13
  • @marc_s, it's been like this for years :-). Moving to a new platform soon, everything is to be redesigned next, hopefully... – Ray Sep 10 '13 at 05:38
  • @Ray see my answer http://stackoverflow.com/a/18711704/1230248 – Dhaval Sep 10 '13 at 06:07

7 Answers7

2

Found a solution to my problem, hope it'd help the others as well.

CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR(MAX)
AS
BEGIN
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO
SELECT dbo.DistinctList('342,34,456,34,3454,456,aa,bb,cc,aa',',') DistinctList
GO

Giving credit to the Author of the script here. http://blog.sqlauthority.com/2009/01/15/sql-server-remove-duplicate-entry-from-comma-delimited-string-udf/

Ray
  • 311
  • 2
  • 6
  • 15
2

First create the function SplitAndRemoveDuplicates which will split a given string by a delimiter, remove the duplicates, and return the string:

CREATE FUNCTION [dbo].[SplitAndRemoveDuplicates] (@sep VARCHAR(32), @s VARCHAR(MAX))

RETURNS VARCHAR(MAX)  
AS
BEGIN

   DECLARE @t TABLE (val VARCHAR(MAX)) 

   DECLARE @xml XML
   SET @xml = N'<root><r>' + REPLACE(@s, @sep, '</r><r>') + '</r></root>'

   INSERT INTO @t(val) SELECT r.value('.','VARCHAR(MAX)') as Item FROM @xml.nodes('//root/r') AS RECORDS(r)

   ;WITH cte
    AS (SELECT ROW_NUMBER() OVER (PARTITION BY val ORDER BY val desc) RN
    FROM  @t)
    DELETE FROM cte
    WHERE  RN > 1

    RETURN (SELECT val + ';' from @t WHERE val <> '' FOR XML PATH ('') )
END

Make use of the function to update the existing data:

UPDATE Article
SET ArticleCategories = (SELECT dbo.[SplitAndRemoveDuplicates](';', ArticleCategories))

Now you have a reusable function.

credit to:https://stackoverflow.com/a/314917/455770 and https://stackoverflow.com/a/3822833/455770

Community
  • 1
  • 1
Rots
  • 5,506
  • 3
  • 43
  • 51
1

without cursors nor while, is a combination of previous questions in this site before

  1. Split Columns
  2. Join The Rows Back

so the answer is create a string Split function

CREATE FUNCTION [dbo].[StringSplit]
(
  @delimited nvarchar(max),
  @delimiter nvarchar(100)     
) RETURNS @t TABLE
(
-- Id column can be commented out, not required for sql splitting string
  id int identity(1,1), -- I use this column for numbering splitted parts
  val nvarchar(max)
)
AS
BEGIN
  declare @xml xml
  set @xml = N'<root><r>' + replace(@delimited,@delimiter,'</r><r>') + '</r></root>'

  insert into @t(val)
  select 
    r.value('.','varchar(max)') as item
  from @xml.nodes('//root/r') as records(r)

  RETURN
END

The query looks like:

declare @Article  table
(
  ArticleCategories varchar(100)
)

insert into @Article values
('3193;2867;3193;'),
('2871;2923;2923;'),
('3278;3337;3337;'),
('2878;2876;2878;'),
('3720;3680;3680;')

;WITH DistinctArticles AS (
    SELECT DISTINCT ArticleCategories, Val
    FROM @article
    CROSS APPLY dbo.[StringSplit](ArticleCategories, ';')
    WHERE Val <> ''
),
Concatenated AS (
    SELECT 
        ArticleCategories,
        STUFF((
            SELECT '; ' + CAST([Val] AS VARCHAR(MAX)) 
            FROM DistinctArticles
            WHERE (ArticleCategories = Results.ArticleCategories) 
    FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)'),1,2,'') + ';' AS DistinctArticleCategories
    FROM DistinctArticles Results
    GROUP BY ArticleCategories
)
UPDATE @Article 
SET ArticleCategories = Concatenated.DistinctArticleCategories
FROM @Article a
INNER JOIN Concatenated ON a.ArticleCategories = Concatenated.ArticleCategories
Community
  • 1
  • 1
Luis LL
  • 2,912
  • 2
  • 19
  • 21
  • You took the same approach as I did, upvoted for that. Mine is simpler... let me know what you think: http://stackoverflow.com/a/18711930/455770 – Rots Sep 10 '13 at 06:25
  • 1
    @Rots I upvoted you too. I think that your will do better with a big tables. – Luis LL Sep 10 '13 at 06:43
0

Please try following solution,its fully tested for situation like yours, I hope this will help you.

DECLARE @ArticleCategories varchar(500)
DECLARE Cursor_Article CURSOR FOR 
SELECT ArticleCategories from Article
OPEN Cursor_Article
FETCH NEXT FROM Cursor_Article INTO @ArticleCategories
WHILE @@FETCH_STATUS = 0
BEGIN

   DECLARE @individualNum varchar(500),@ArticleCategoriesNew varchar(500) ,@ArticleCategoriesRem varchar(500)
   SET @ArticleCategoriesNew='';
   SET @ArticleCategoriesRem=@ArticleCategories

   WHILE PATINDEX('%;%',@ArticleCategoriesRem) > 0
   BEGIN
       SET @individualNum = SUBSTRING(@ArticleCategoriesRem, 0, PATINDEX('%;%',@ArticleCategoriesRem)) 

       SET @ArticleCategoriesRem = SUBSTRING(@ArticleCategoriesRem, LEN(@individualNum + ';') + 1, LEN(@ArticleCategoriesRem))
       if(@ArticleCategoriesRem not like '%'+@individualNum+'%')
       set @ArticleCategoriesNew=CASE WHEN ISNULL(@ArticleCategoriesNew,'')='' THEN @individualNum ELSE @ArticleCategoriesNew+';'+@individualNum+';' END

   END
   --If ArticleCategories is unique then leave same condition else Fetch primary key in cursor and append in Where condition of Update query
   Update Article SET ArticleCategories=@ArticleCategoriesNew WHERE ArticleCategories=@ArticleCategories    

   FETCH NEXT FROM Cursor_Article INTO @ArticleCategories
END 
CLOSE Cursor_Article;
DEALLOCATE Cursor_Article;
Aamir Shahzad
  • 6,683
  • 8
  • 47
  • 70
Siddique Mahsud
  • 1,453
  • 11
  • 21
0

try following solution

create a UDF first which is as below

alter function dbo.SplitString2(@inputStr varchar(1000),@del varchar(5))
RETURNS varchar(max)
As
BEGIN

DECLARE @t table(col1 varchar(100))
DECLARE @table table(col1 varchar(100))
DECLARE @ret varchar(max)
Set @ret = ''
INSERT INTO @t
select @inputStr

if CHARINDEX(@del,@inputStr,1) > 0
BEGIN
    ;WITH CTE1 as (
    select ltrim(rtrim(LEFT(col1,CHARINDEX(@del,col1,1)-1))) as col,RIGHT(col1,LEN(col1)-CHARINDEX(@del,col1,1)) as rem from @t
    union all
    select ltrim(rtrim(LEFT(rem,CHARINDEX(@del,rem,1)-1))) as col,RIGHT(rem,LEN(rem)-CHARINDEX(@del,rem,1))
    from CTE1 c
    where CHARINDEX(@del,rem,1)>0
    )

        INSERT INTO @table 
        select col from CTE1
        union all
        select rem from CTE1 where CHARINDEX(@del,rem,1)=0
    END
ELSE
BEGIN
    INSERT INTO @table 
    select col1 from @t
END

Set @ret = (Select distinct col1 + ';'  from @table for xml path(''))

return @ret

END

then you can run following select/update query as per you requirement

update query -- this will update records in your table

update ArticleCategory Set ArticleCategories = dbo.SplitString2(ArticleCategories, ';' )

select query -- this will select distinct record only

Select ArticleCategories , dbo.SplitString2(ArticleCategories, ';' ) from ArticleCategory
Dhaval
  • 2,801
  • 20
  • 39
0

If it doesn't matter which rows must stay:

DELETE FROM Article WHERE (
    SELECT COUNT(*) FROM Article t WHERE
    t.ArticleCategories=Article.ArticleCategories AND t.%%physloc%%<Article.%%physloc%%
)>0;

This is how I would do it in SQLite, not sure if I matched syntax of MS SQL Server...

LS_ᴅᴇᴠ
  • 10,823
  • 1
  • 23
  • 46
-2
SELECT DISTINCT ArticleCategories FROM Article

OR

SELECT ArticleCategories 
FROM Article
GROUP BY ArticleCategories 

AND THIS IS FOR DELETE DUPLICATE VALUES

DELETE
FROM Article
WHERE ArticleCategories NOT IN
(
   SELECT MAX(ArticleCategories)
   FROM Article
   GROUP BY ArticleCategories
)
  • The goal is to delete repeated numbers from ArticleCategories column for each row. This answer doesn't delete them. – Rots Sep 10 '13 at 04:41
  • 2
    I don't think you're interpreting the question correctly. By my reading, the OP has individual values such as `3193;2867;3193;` -- where the semicolons are actually part of the field values -- and (s)he wishes to write an `UPDATE` statement that will modify those values to remove duplicates -- resulting in values such as `3193;2867;`. – ruakh Sep 10 '13 at 04:49
  • Also, even by your apparent interpretation, your `DELETE` statement doesn't really make sense: it says to delete any rows from `Article` whose `ArticleCategories` has a value that doesn't appear in any `Article` rows' `ArticleCategories`. That will never delete anything. – ruakh Sep 10 '13 at 04:51
  • Now you are answering the question, but your answer is incorrect I'm afraid. – Rots Sep 10 '13 at 04:53
  • @user2763202 Don't worry too much. This is a question and answer site, so the answers need to be accurate - don't take it personally. – Rots Sep 10 '13 at 04:59