2

I need to extract multiple strings between 2 specific characters that repeats several times in a row. for example; these are 2 rows from my table:

id myString
1 'aaa(bb)ccc(ddd)'
2 'eeee(ff)gggg(hhh)iii'

my expected results are:

id myString
1  bb
1  ddd
2  ff
2  hhh

How can I do it ?

Thanks in advance.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
user4810167
  • 37
  • 1
  • 4

4 Answers4

2

One method uses recursive CTEs:

with cte as (
      select id,
             left(stuff(myString, 1, charindex('(', myString), ''),
                  charindex(')', myString) - charindex('(', myString) - 1
                 ) as val,
             stuff(myString, 1, charindex(')', myString) + 1) as rest
      from t
      where myString like '%(%)%'
      union all
      select id,
             left(stuff(rest, 1, charindex('(', rest), ''),
                  charindex(')', rest) - charindex('(', rest) - 1
                 ) as val,
             stuff(rest, 1, charindex(')', myString) + 1) as rest
      from cte
      where rest like '%(%)%'
     )
select id, val
from cte;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
2

If the string always follows the repeating pattern of '..(..)' one method to solve this uses a CSV Splitter function by Jeff Moden, replacing the second delimiter with the first delimiter, and getting only the second sets using modulo (%):

select 
    Id
 , myString = x.item
from t
  cross apply (
    select Item = ltrim(rtrim(i.Item))
      from [dbo].[delimitedsplit8K](replace(t.mystring,')','('),'(') as i
      where ItemNumber%2=0
      ) x

test setup: http://rextester.com/DAI48471

Added example input of 3,'jjj(kkk)ll(mmm)n(ooooo)pp(qq)rr'

returns:

+----+----------+
| Id | myString |
+----+----------+
|  1 | bb       |
|  1 | ddd      |
|  2 | ff       |
|  2 | hhh      |
|  3 | kkk      |
|  3 | mmm      |
|  3 | ooooo    |
|  3 | qq       |
+----+----------+

splitting strings reference:

SqlZim
  • 37,248
  • 6
  • 41
  • 59
  • This is a nice solution. Seems to have a problem in cases with bad data like '(()ab' or ')))ab' - this would return ab even thought it is not inside any brackets – t-clausen.dk Feb 28 '17 at 14:46
  • @t-clausen.dk That is true, and I tried to caution that with the first line of my answer "If the string always follows the repeating pattern of `'..(..)'`" – SqlZim Feb 28 '17 at 16:00
1

I know I'm coming very late to the party, but I ran into the exact same problem, and found a very easy way to complete it with a recursive CTE.

I hope this is helpful to anyone running into the same issue.

IF OBJECT_ID('tempdb..#Temp') IS NOT NULL DROP TABLE #Temp
CREATE TABLE #Temp (
    [ID]    INT,
    [Text]  VARCHAR(255)
)

INSERT INTO #Temp ([ID], [Text])
VALUES (1, '(test1) sdkjsdlfnicsn */12e3mdsf ksd (test 2) $#@ewmfdsdk (test3)'), (2, '(test4) sdvdsg */sdg ksd (test 5) $#@ewmfdsdk (test6)(test7)')

;WITH CTE AS (
    SELECT [x].[ID], [x].[Text], [x].[OpenBraket], [x].[CloseBraket]
    ,SUBSTRING([x].[Text], ([x].[OpenBraket] + 1), ([x].[CloseBraket] - [x].[OpenBraket] - 1)) AS [Value]
    ,SUBSTRING([x].[Text], ([x].[CloseBraket] + 1), (LEN([Text]) - [x].[CloseBraket])) AS [RemainingText]
    FROM (
        SELECT [ID], [Text]
            ,CHARINDEX('(', [Text]) AS [OpenBraket]
            ,CHARINDEX(')', [Text]) AS [CloseBraket]        
        FROM #Temp
        WHERE [Text] LIKE '%(%)%'
    ) x

    UNION ALL

    SELECT [z].[ID], [z].[Text], [z].[OpenBraket], [z].[CloseBraket]
        ,SUBSTRING([z].[RemainingText], ([z].[OpenBraket] + 1), ([z].[CloseBraket] - [z].[OpenBraket] - 1)) AS [Value]
        ,SUBSTRING([z].[RemainingText], ([z].[CloseBraket] + 1), (LEN([RemainingText]) - [z].[CloseBraket])) AS [RemainingText]
    FROM (
        SELECT [ID], [Text], [RemainingText]
            ,CHARINDEX('(', [RemainingText]) AS [OpenBraket]
            ,CHARINDEX(')', [RemainingText]) AS [CloseBraket]
        FROM [CTE]
        WHERE [RemainingText] LIKE '%(%)%'
    ) z
)

SELECT [ID], [Value] FROM CTE
Joe Packer
  • 525
  • 3
  • 15
0

Answer works in sqlserver 2016

DECLARE @t table(id int, myString varchar(40))
INSERT @t 
VALUES (1,'aaa(bb)ccc(ddd)'),(2, 'eeee(ff)gggg(hhh)iii')

SELECT id, stuff(value, 1, charindex('(',value),'') myString
FROM 
  @t t
CROSS APPLY
  STRING_SPLIT(mystring,')') 
WHERE value like '%(%'
ORDER BY id

Fiddle

t-clausen.dk
  • 43,517
  • 12
  • 59
  • 92
  • There is no reason to add a presentation ORDER BY to this query and will cause a sort in the execution plan depending on how the table is indexed. – Alan Burstein Mar 09 '17 at 23:58
  • @AlanBurstein I am not going into childish war with you. This is a great answer you down voted because of [this](http://stackoverflow.com/questions/42698979/select-query-to-match-multiple-columns-with-multiple-keywords/42703934#comment72534461_42703934). Index on this has nothing to do with the answer. – t-clausen.dk Mar 10 '17 at 09:06
  • Your answer is fine. A sort in the execution plan will slow your query down provided there is no index in place to handle it. Performance always matters. For the record, I attempted to un-vote down your answer but can't. Cheers. – Alan Burstein Mar 10 '17 at 14:28