5

I want to make a query in sql-server which can make the following output as like column _B from column_A.Columns are varchar type.

  Column_A                                 column_B
  karim,karim,rahim,masud,raju,raju        karim,rahim,masud,raju
  jon,man,jon,kamal,kamal                  jon,man,kamal
  c,abc,abc,pot                            c,abc,pot
Biddut
  • 418
  • 1
  • 6
  • 17
  • 3
    Never, ever store data as comma separated items. It will only cause you lots of trouble. – jarlh Mar 19 '18 at 11:10
  • 2
    Fix your data structure to be more compatible with relational databases. Then restructuring will be much simpler. – Gordon Linoff Mar 19 '18 at 11:11
  • 1
    Why in the first row rahim must be discarded in column B? – Joe Taras Mar 19 '18 at 11:15
  • Thanks,I have edited. – Biddut Mar 19 '18 at 11:17
  • 1
    The example lead to the search: sql remove duplicates from comma separated string. The first page contains https://stackoverflow.com/questions/42918548/remove-duplicates-from-comma-or-pipeline-operator-string and https://stackoverflow.com/questions/20882509/sql-server-2000-remove-duplicates-from-comma-separated-string – Dan D. Mar 19 '18 at 11:17

2 Answers2

6

First of all: You were told in comments alread, that this is a very bad design (violating 1.NF)! If you have the slightest chance to change this, you really should... Never store more than one value within one cell!

If you have to stick with this (or in order to repair this mess), you can go like this:

This is the simplest approach I can think of: Transform the CSV to an XML and call XQuery-function distinct-values()

DECLARE @tbl TABLE(ColumnA VARCHAR(MAX));
INSERT INTO @tbl VALUES
 ('karim,karim,rahim,masud,raju,raju')
,('jon,man,jon,kamal,kamal')
,('c,abc,abc,pot');

WITH Splitted AS
(
    SELECT ColumnA 
          ,CAST('<x>' + REPLACE(ColumnA,',','</x><x>') + '</x>' AS XML) AS TheParts
    FROM @tbl 
)
SELECT ColumnA
      ,TheParts.query('distinct-values(/x/text())').value('.','varchar(250)') AS ColumnB
FROM Splitted;

The result

ColumnA                             ColumnB
karim,karim,rahim,masud,raju,raju   karim rahim masud raju
jon,man,jon,kamal,kamal             jon man kamal
c,abc,abc,pot                       c abc pot

UPDATE Keep the commas

WITH Splitted AS
(
    SELECT ColumnA 
          ,CAST('<x>' + REPLACE(ColumnA,',','</x><x>') + '</x>' AS XML) AS TheParts
    FROM @tbl 
)
SELECT ColumnA
      ,STUFF(
          (TheParts.query
          ('
          for $x in distinct-values(/x/text())
            return <x>{concat(",", $x)}</x>
          ').value('.','varchar(250)')),1,1,'') AS ColumnB
FROM Splitted;

The result

ColumnB
karim,rahim,masud,raju
jon,man,kamal
c,abc,pot
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • wouldn't it be easier to use the REPLACE in order to return the comma's? something like this? "REPLACE(TheParts.query('distinct-values(/x/text())').value('.','varchar(250)') , ' ' , ',') AS ColumnB" ? – hkravitz Mar 19 '18 at 11:54
  • @hkravitz Well, If the values might include blanks, this would place commas in bad places... I often was hoping for a possibility to pass a delimiter to such `XQuery` functions (same with `data()`) But this is always a blank by default... – Shnugo Mar 19 '18 at 11:56
0

SQL remove duplicates from comma separated string:

Pseudocode: Make a postgresql function that receives as input the comma delimited string, and creates another array in memory. Split the string on comma, trim whitespace and enumerate each item, if the item doesn't appear in the new list, then add it. Finally flatten the new array to string and return.

drop function if exists remove_duplicates_from_comma_separated_string(text);

CREATE or replace FUNCTION remove_duplicates_from_comma_separated_string(arg1 text) 
RETURNS text language plpgsql AS $$ declare 
  item text;  
  split_items text[];  
  ret_items text[];  
  ret_val text; 
BEGIN 
  --split your string on commas and trim whitespace 
  split_items := string_to_array(ltrim(arg1), ','); 
  --enumerate each item, if it doesn't exist in the new array then add it. 
  FOREACH item IN ARRAY split_items LOOP 
    if ( item::text = ANY(ret_items)) then 
    else 
        --append this unique item into ret_items 
        select array_append(ret_items, ltrim(item)) into ret_items; 
    end if;  
  END LOOP; 
  --flatten the final array to a text with comma delimiter 
  SELECT array_to_string(ret_items, ',', '*') into ret_val; 
  return ret_val; 
END; $$;

So now we can invoke the function on a table thustly:

drop table if exists foo_table; 
create table foo_table(name text); 
insert into foo_table values('karim,karim,rahim,masud,raju,raju'); 
insert into foo_table values('jon,man,jon,kamal,kamal'); 
insert into foo_table values('jon,man,kamal'); 
insert into foo_table values('c,abc,poty'); 
insert into foo_table values('c,abc,abc,kotb'); 
select remove_duplicates_from_comma_separated_string(name) from foo_table; 

Which prints:

┌───────────────────────────────────────────────┐ 
│ remove_duplicates_from_comma_separated_string │ 
├───────────────────────────────────────────────┤ 
│ karim,rahim,masud,raju                        │ 
│ jon,man,kamal                                 │ 
│ jon,man,kamal                                 │ 
│ c,abc,poty                                    │ 
│ c,abc,kotb                                    │ 
└───────────────────────────────────────────────┘ 

Code smell haaax factor: 9.5 of 10. Construction crew watches the novice programmer bang in a nail with the $90 sql brand pipe wrench, everyone rolls their eyes.

Eric Leschinski
  • 146,994
  • 96
  • 417
  • 335