0

I'm having some trouble parsing some data that I get that is in the following format:

1. FREE_AGENT    "10710,|9452,"
2. FREE_AGENT    "11381,|2918,"
3. FREE_AGENT    "10220,|"
4. FREE_AGENT    "9625,|"
5. FREE_AGENT    "11213,11225,11193,|"

everything before the ,| delimiter is considered a player add, everything after the ,| delimiter is considered a drop.

Ideally I'd like to get this parsed out into rows such as the following:

1. FREE_AGENT    ADD    "10710"
1. FREE_AGENT    DROP    "9452"
2. FREE_AGENT    ADD    "11381"
2. FREE_AGENT    DROP    "2918"
3. FREE_AGENT    ADD    "10220"
4. FREE_AGENT    ADD    "9625"
5. FREE_AGENT    ADD    "11213"
5. FREE_AGENT    ADD    "11225"
5. FREE_AGENT    ADD    "11193"

any help would be appreciated. I have basic t-sql string splitters but not getting much luck such as

CREATE FUNCTION split(
    @delimited NVARCHAR(MAX),
    @delimiter NVARCHAR(100)
) RETURNS @t TABLE (id INT IDENTITY(1,1), val NVARCHAR(MAX))
AS
BEGIN
    DECLARE @xml XML
    SET @xml = N'<t>' + REPLACE(@delimited,@delimiter,'</t><t>') + '</t>'
    INSERT INTO @t(val)
    SELECT  r.value('.','varchar(MAX)') as item
    FROM  @xml.nodes('/t') as records(r)
    RETURN
END
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
user257655
  • 650
  • 1
  • 7
  • 13
  • 2
    How are you getting these data? Is it happening regularly? Mostly I'm asking because this can be done in SQL but it's a pain. If you are already handling the data a little bit before loading it into your DBMS then you may as well split it outside of SQL. – Brad Apr 09 '14 at 01:26

2 Answers2

0

I like using dynamic sql to parse strings. You can replace the delimiters with Union All Select (among some other parsing) to make this work:

-- load your test data
declare @YourTable table(AGENT varchar(10),STRING varchar(50))
insert into @YourTable values
    ('FREE_AGENT','10710,|9452,'),
    ('FREE_AGENT','11381,|2918,'),
    ('FREE_AGENT','10220,|'),
    ('FREE_AGENT','9625,|'),
    ('FREE_AGENT','11213,11225,11193,|')

-- create dynamic query to return parsed STRING value
-- with associated AGENT and ADD_DROP value
declare @sql varchar(max)
set @sql = ''
select @sql = @sql + '
select
    ''' + AGENT + ''' AGENT,
    ''' + ADD_DROP + ''' ADD_DROP,
    ' + replace(STRING,',','
union all
select
    ''' + AGENT + ''' AGENT,
    ''' + ADD_DROP + ''' ADD_DROP,') + '
union all'
from    (
        select
            AGENT,
            'ADD' ADD_DROP,
            replace(left(STRING,charindex('|',STRING)),',|','') STRING
        from @YourTable
        where replace(left(STRING,charindex('|',STRING)),'|','') <> ''
        union all
        select
            AGENT,
            'DROP' ADD_DROP,
            replace(replace(right(STRING,charindex('|',
                reverse(STRING))),'|','') + '|',',|','') STRING
        from @YourTable
        where replace(right(STRING,charindex('|',reverse(STRING))),'|','') <> ''
        ) q

-- remove last union all
set @sql = replace(@sql + '|','union all|','')

-- return results
exec(@sql)
Ron Smith
  • 3,241
  • 1
  • 13
  • 16
0

Since you only have a single ",|" delimiter to separate ADDs from DROPs, you can do that in a CTE so that they return in two different columns, one CSV columns of ADDs and one CSV column of DROPs. Then you can split those with a standard splitter, using a UNION ALL to combine the splitting of one column with the splitting of the other.

Example:
(please note that the example below uses a SQLCLR library called SQL# (SQLsharp) of which I am the author. The split function used here is available in the Free version. If you prefer, there are a couple of other methods to properly split a CSV in SQL Server and I have noted them in this answer)

DECLARE @Data TABLE (AgentID INT NOT NULL PRIMARY KEY,
                     Agent NVARCHAR(50),
                     Actions VARCHAR(500))

INSERT INTO @Data VALUES (1, N'FREE_AGENT', '10710,|9452,')
INSERT INTO @Data VALUES (2, N'FREE_AGENT','11381,|2918,')
INSERT INTO @Data VALUES (3, N'FREE_AGENT','10220,|')
INSERT INTO @Data VALUES (4, N'FREE_AGENT','9625,|')
INSERT INTO @Data VALUES (5, N'FREE_AGENT','11213,11225,11193,|')
INSERT INTO @Data VALUES (6, N'FREE_AGENT','9633,|342,54545,')

;WITH SplitActions AS
(
    SELECT sd.AgentID,
           sd.Agent,
           sd.Actions,
           CHARINDEX(',|', sd.Actions) AS [DelimiterLocation]
    FROM   @Data sd
),
SplitValues AS
(
    SELECT sa.AgentID,
           sa.Agent,
           SUBSTRING(sa.Actions, 1, (sa.DelimiterLocation - 1)) AS [Adds],
           SUBSTRING(sa.Actions, (sa.DelimiterLocation + 2), 500) AS [Drops]
    FROM   SplitActions sa
)
SELECT  sv.AgentID, sv.Agent, N'ADD' AS [Action], addvals.SplitVal AS [ID]
FROM    SplitValues sv
CROSS APPLY SQL#.String_Split4k(sv.Adds, N',', 2) addvals
UNION ALL
SELECT  sv.AgentID, sv.Agent, N'DROP' AS [Action], dropvals.SplitVal AS [ID]
FROM    SplitValues sv
CROSS APPLY SQL#.String_Split4k(sv.Drops, N',', 2) dropvals
ORDER BY [AgentID] ASC, [Action] ASC

Output:

AgentID  Agent       Action   ID
1        FREE_AGENT  ADD      10710
1        FREE_AGENT  DROP     9452
2        FREE_AGENT  ADD      11381
2        FREE_AGENT  DROP     2918
3        FREE_AGENT  ADD      10220
4        FREE_AGENT  ADD      9625
5        FREE_AGENT  ADD      11213
5        FREE_AGENT  ADD      11225
5        FREE_AGENT  ADD      11193
6        FREE_AGENT  ADD      9633
6        FREE_AGENT  DROP     342
6        FREE_AGENT  DROP     54545
Community
  • 1
  • 1
Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171