2

I have a table that looks like this:

ProductId, Color
"1", "red, blue, green"
"2", null
"3", "purple, green"

And I want to expand it to this:

ProductId, Color
1, red
1, blue
1, green
2, null
3, purple
3, green

What's the easiest way to accomplish this? Is it possible without a loop in a stored procedure?

Rich Seller
TheSoftwareJedi

8 Answers

9

Take a look at this function. I've done similar tricks to split and transpose data in Oracle. Loop over the data, inserting the decoded values into a temp table. The convenient thing is that MS SQL Server will let you do this on the fly, while Oracle requires an explicit temp table.

MS SQL Split Function
Better Split Function

Edit by author: This worked great. Final code looked like this (after creating the split function):

select p.productid, colortable.items as color
from product p 
    cross apply split(p.color, ',') as colortable
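
The linked split functions are not reproduced here. As a rough sketch of what such a function might look like on SQL Server 2005, assuming a table-valued function named `dbo.Split` that returns a single `items` column (names chosen only to match the query above; the linked implementations may well differ):

```sql
-- Hypothetical split function: returns one row per delimited item.
-- Items are trimmed; a NULL or empty list returns no rows.
CREATE FUNCTION dbo.Split (@list varchar(8000), @delimiter char(1))
RETURNS @result TABLE (items varchar(8000))
AS
BEGIN
    DECLARE @pos int;

    WHILE LEN(@list) > 0
    BEGIN
        SET @pos = CHARINDEX(@delimiter, @list);

        IF @pos = 0
        BEGIN
            -- No delimiter left: the remainder is the last item.
            INSERT INTO @result (items) VALUES (LTRIM(RTRIM(@list)));
            SET @list = '';
        END
        ELSE
        BEGIN
            -- Take everything before the delimiter, then drop it from the list.
            INSERT INTO @result (items) VALUES (LTRIM(RTRIM(LEFT(@list, @pos - 1))));
            SET @list = SUBSTRING(@list, @pos + 1, 8000);
        END
    END

    RETURN;
END
```

With a function like this, a NULL color list produces no rows, so `CROSS APPLY` would drop product 2 from the result. `OUTER APPLY` should keep it with a NULL color, which matches the output asked for in the question.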
TheSoftwareJedi
chilltemp
  • For SQL 2016 you can use the built in function: `CROSS APPLY STRING_SPLIT(p.color, ',')` – Dave Oct 18 '16 at 20:23
5

based on your tables:

create table test_table
(
     ProductId  int
    ,Color      varchar(100)
)

insert into test_table values (1, 'red, blue, green')
insert into test_table values (2, null)
insert into test_table values (3, 'purple, green')

create a new table like this:

CREATE TABLE Numbers
(
    Number  int   not null primary key
)

that has rows containing values 1 to 8000 or so.
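
One possible way to fill it is sketched below. It assumes `sys.all_objects` cross-joined with itself yields at least 8000 rows, which holds on a typical install but is worth checking:

```sql
-- Populate Numbers with 1..8000 without a loop.
-- Assumption: sys.all_objects x sys.all_objects has at least 8000 rows.
INSERT INTO Numbers (Number)
SELECT n
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY a.object_id, b.object_id) AS n
    FROM sys.all_objects a
        CROSS JOIN sys.all_objects b
) AS x
WHERE n <= 8000;
```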

this will return what you want:

EDIT
here is a much better query, slightly modified from the great answer from @Christopher Klein:

I added the "LTRIM()" so that the spaces in the color list ("red, blue, green") are handled properly. His solution requires a list with no spaces ("red,blue,green"). I also prefer to use my own Numbers table rather than master.dbo.spt_values, which allows the removal of one derived table too.

SELECT
    ProductId, LEFT(PartialColor, CHARINDEX(',', PartialColor + ',')-1) as SplitColor
    FROM (SELECT 
              t.ProductId, LTRIM(SUBSTRING(t.Color, n.Number, 200)) AS PartialColor
              FROM test_table             t
                  LEFT OUTER JOIN Numbers n ON n.Number<=LEN(t.Color) AND SUBSTRING(',' + t.Color, n.Number, 1) = ','
         ) t

EDIT END

SELECT
    ProductId, Color --,number
    FROM (SELECT
              ProductId
                  ,CASE
                       WHEN LEN(List2)>0 THEN LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(',', List2, number+1)-number - 1)))
                       ELSE NULL
                   END AS Color
                  ,Number
              FROM (
                       SELECT ProductId,',' + Color + ',' AS List2
                           FROM test_table
                   ) AS dt
                  LEFT OUTER JOIN Numbers n ON (n.Number < LEN(dt.List2)) OR (n.Number=1 AND dt.List2 IS NULL)
              WHERE SUBSTRING(List2, number, 1) = ',' OR List2 IS NULL
         ) dt2
    ORDER BY ProductId, Number, Color

here is my result set:

ProductId   Color
----------- --------------
1           red
1           blue
1           green
2           NULL
3           purple
3           green

(6 row(s) affected)

which is the same order you want...

KM.
  • This did exactly what I needed. In my situation I was dealing with data that had been combined (multiple work orders that had run under the same billable line item, I needed to break them apart while maintaining the line item number). – tmountjr Sep 30 '11 at 18:50
4

You can try this out; it doesn't require any additional functions:

declare @t table (col1 varchar(10), col2 varchar(200))
insert @t
          select '1', 'red,blue,green'
union all select '2', NULL
union all select '3', 'green,purple'


select col1, left(d, charindex(',', d + ',')-1) as e from (
    select *, substring(col2, number, 200) as d from @t col1 left join
        (select distinct number from master.dbo.spt_values where number between 1 and 200) col2
        on substring(',' + col2, number, 1) = ',') t
Christopher Klein
  • GREAT ANSWER, this is a much better query than my first try. See my answer for a modified version of this. Your use of table alias values which are the same as column names was confusing, and the use of a system table for numbers forces you to use an extra derived table. Other than that, GREAT JOB! – KM. Apr 01 '09 at 12:21
1

I arrived at this question 10 years after it was posted. SQL Server 2016 added the STRING_SPLIT function. Using it, this can be written as below.

declare @product table
(
    ProductId int,
    Color     varchar(max)
);
insert into @product values (1, 'red, blue, green');
insert into @product values (2, null);
insert into @product values (3, 'purple, green');

select
    p.ProductId as ProductId,
    ltrim(split_table.value) as Color
from @product p
outer apply string_split(p.Color, ',') as split_table;
Hiroshi
  • Ha! Thanks for wrapping up these old questions! – TheSoftwareJedi Sep 30 '19 at 12:18
  • Definitely a good answer (although I'm using `CROSS APPLY` rather than `OUTER APPLY`). The only point I'd make about this being marked as the accepted answer to the question is that (as mentioned clearly in the answer) "STRING_SPLIT" only became available in SQL 2016, and the question specifically mentions SQL Server 2005 in both the title and the tags. – Richardissimo Oct 19 '22 at 13:15
0

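Just convert your column into XML and query it. Here's an example.

declare @CSV varchar(max);
set @CSV = 'red,blue,green'; -- sample input; values containing &, < or > would need escaping first

select 
    a.value('.', 'varchar(42)') c
from (select cast('<r><a>' + replace(@CSV, ',', '</a><a>') + '</a></r>' as xml) x) t1
cross apply x.nodes('//r/a') t2(a)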
nurettin
0

Fix your database if at all possible. Comma-delimited lists in database cells indicate a flawed schema 99% of the time or more.
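
As a rough illustration of what a fixed schema could look like, here is a minimal sketch. The `ProductColor` table and the `Product` table it joins to are hypothetical names, not taken from the question:

```sql
-- Hypothetical normalized layout: one row per (product, color) pair.
CREATE TABLE ProductColor
(
    ProductId int          NOT NULL,
    Color     varchar(100) NOT NULL,
    PRIMARY KEY (ProductId, Color)
);

INSERT INTO ProductColor (ProductId, Color) VALUES (1, 'red');
INSERT INTO ProductColor (ProductId, Color) VALUES (1, 'blue');
INSERT INTO ProductColor (ProductId, Color) VALUES (1, 'green');
INSERT INTO ProductColor (ProductId, Color) VALUES (3, 'purple');
INSERT INTO ProductColor (ProductId, Color) VALUES (3, 'green');

-- The expanded result then becomes a plain join, with no string parsing.
-- Assumes a Product table with a ProductId column; product 2 has no
-- colors and comes back with a NULL Color.
SELECT p.ProductId, pc.Color
FROM Product p
    LEFT OUTER JOIN ProductColor pc ON pc.ProductId = p.ProductId;
```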

Joel Coehoorn
0

I would create a CLR table-valued function for this:

http://msdn.microsoft.com/en-us/library/ms254508(VS.80).aspx

The reason for this is that CLR code is going to be much better at parsing apart the strings (computational work) and can pass that information back as a set, which is what SQL Server is really good at (set management).

The CLR function would return a series of records based on the parsed values (and the input id value).

You would then use a CROSS APPLY on each element in your table.

casperOne
  • IMHO that is overkill for something so trivial. – James Mar 31 '09 at 20:54
  • @James: It's really simple, MUCH simpler than the hoops you have to jump through in order to parse strings in T-SQL, and it's always going to be faster at parsing. The cross apply is the natural choice here once you have ANY function that parses the lines apart. – casperOne Apr 01 '09 at 05:11
  • Depending on the number of rows to process and the length of the CSV colors, a CLR will not scale. It will work fine in this example of 3 rows, but if you have to run this query all day, every day, it will be slow. A pure SQL query like mine will be much faster. – KM. Apr 01 '09 at 11:49
  • @mike: That's absolutely not true and I challenge you to show the tests to prove it. The CLR scales just fine, given that SQL Server manages everything the CLR requires (memory, threads) so it can't get too greedy. Additionally, the CLR is always going to be better at procedural code like this. – casperOne Apr 01 '09 at 17:04
  • @mike: I'd be willing to show you my own tests of CLR code vs T-SQL code when parsing strings. I've seen anywhere from a 30%-200% increase in speed when processing strings under 10000 characters, with all methods topping out at 100000 chars or so. – casperOne Apr 01 '09 at 17:18
  • did you split with a loop or a Numbers table? The CLR may have an advantage for long strings. In this example of color names, I'd doubt that the strings are 10,000 or 100,000 characters long. Do a test where you use a Numbers table to split 10,000 rows of 255 long strings? – KM. Apr 02 '09 at 17:32
0

Why not use dynamic SQL for this purpose, something like this(adapt to your needs):

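DECLARE @dynSQL VARCHAR(max)
SET @dynSQL = ''
-- one "union all select '<color>'" fragment per list item; rows with a NULL Color are skipped
select @dynSQL = @dynSQL + ' union all select ''' + REPLACE(Color, ',', ''' union all select ''') + ''''
    from [Table] where Color is not null
-- drop the leading ' union all ' (11 characters) and prepend the insert
SET @dynSQL = 'insert into DestinationTable(field) ' + STUFF(@dynSQL, 1, 11, '')
exec (@dynSQL) -- parentheses are required to execute a string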

One advantage is that you can use it on any SQL Server version

MelOS