3

I have the following string:

bzip2,1,668,sometext,foo,bar

How can I SELECT only sometext,foo,bar? The length of the string preceding the 3rd comma may differ and there may be commas within sometext,foo,bar.

I'd like this in as concise code as possible, i.e. preferably 1 line of code, no loops. But feel free to post any solution you think of.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
Danny Beckett
  • 20,529
  • 24
  • 107
  • 134
  • 2
    @a_horse_with_no_name This has absolutely *zero* to do with the question. – Danny Beckett Sep 12 '12 at 11:54
  • Perhaps this might give you a start: http://stackoverflow.com/questions/8726111/sql-server-find-nth-occurrence-in-a-string – Bridge Sep 12 '12 at 11:55
  • Will there always be exactly 3 commas (4 "elements") ? – Bridge Sep 12 '12 at 11:57
  • @a_horse_with_no_name It isn't comma seperated values. It is a BZ compressed string. Each row has a different string. Please just trust the question here, it is perfectly valid. – Danny Beckett Sep 12 '12 at 11:58
  • @Bridge This is what I meant with "there may be commas within `sometext`". There might be 100 commas. – Danny Beckett Sep 12 '12 at 12:02
  • I've edited the question so we stop getting answers that ignore the fact that there may be commas in the remainder of the string. In the future please make this case more obvious up front. – Aaron Bertrand Sep 12 '12 at 12:49
  • Since @a_horse_with_no_name's post now has 8 upvotes, I'll explain a little why I have something like this in a database. An external application stores BZ2-compressed rich-text notes in this column. It automatically adds 'bzip2,1,668,' to the start of every string (the numbers may differ each time). In order to decompress the string, I need to remove this prefix, otherwise BZDecompress returns that it is not a valid BZ string. Nothing to do with data storage; and certainly not something that can be normalised. – Danny Beckett Sep 14 '12 at 06:32
  • You should have added that explanation right from the start. –  Sep 14 '12 at 15:46

4 Answers4

8

try this:

Do a substring from 3rd comma to the end of the string.. To find the 3 commas , I am using charindex() function 3 times

  declare @str varchar(50)='bzip2,1,668,some,text'

  select substring(@str,
  CHARINDEX(',',@str,CHARINDEX(',',@str,CHARINDEX(',',@str,1)+1)+1)+1,
  LEN(@str)-CHARINDEX(',',@str,CHARINDEX(',',@str,CHARINDEX(',',@str,1)+1)+1))

result:

some,text
Joe G Joseph
  • 23,518
  • 5
  • 56
  • 58
  • Just to comment here, since I see this is still getting more upvotes... it uses double the amount of `CHARINDEX`s as my answer. A better solution is: `declare @v varchar(max) = 'bzip2,1,668,sometext' select substring(@v, CHARINDEX(',', @v, CHARINDEX(',', @v, CHARINDEX(',', @v)+1)+1)+1, len(@v))` (as I posted as my answer). – Danny Beckett Sep 19 '12 at 11:08
2

Code:

declare @input varchar(max) = 'bzip2,1,668,s,o,m,e,t,e,x,t,f,o,o,b,a,r'
--declare @input varchar(max) = 'bzip2,,'
declare @thirdCommaPosition int = nullif(charindex(',', @input, nullif(charindex(',', @input, nullif(charindex(',', @input, 1),0)+1),0)+1 ),0)
select stuff(@input, 1, @thirdCommaPosition, '')

Output:

s,o,m,e,t,e,x,t,f,o,o,b,a,r

Edit

Added nullif's to the third comma calculation part, without them it's possible to get inconsistent results.

Denis Valeev
  • 5,975
  • 35
  • 41
1

I just figured out something that works:

declare @v varchar(max) = 'bzip2,1,668,sometext'
select substring(@v, CHARINDEX(',', @v, CHARINDEX(',', @v, CHARINDEX(',', @v)+1)+1)+1, len(@v))
Danny Beckett
  • 20,529
  • 24
  • 107
  • 134
  • 5
    I would recommend you accept @Joe answer then, as that's exactly the same solution he provided 3 minutes ago. – Mike Perrenoud Sep 12 '12 at 12:03
  • @Mike Yes, I figured this out a little before @Joe posted his answer. His is slightly different though, it has more `1`'s and `@v`'s – Danny Beckett Sep 12 '12 at 12:05
  • 1
    As a rule of thumb, the OP should not post answers to his own question within several minutes of posting. – Denis Valeev Sep 14 '12 at 10:56
  • @DenisValeev Firstly, anyone can post an answer immediately, *whilst* posting their question. Secondly, I'd been searching for a solution for half an hour. By chance I thought of the solution a few minutes after posting the question. You've lost me; at what point is it unacceptable to post an answer? Especially when it's the best answer of the bunch... – Danny Beckett Sep 14 '12 at 10:59
  • Just noticed that @Joe's answer also includes the double the calls to `CHARINDEX()` – Danny Beckett Sep 15 '12 at 16:54
1

Here is another idea

DECLARE @xml AS XML,@str AS VARCHAR(50)
    SET @str='bzip2,1,668,sometext,foo,bar'

    SET @xml = CAST(('<X>'+REPLACE(@str,',' ,'</X><X>')+'</X>') AS XML)


        SELECT FinalResult = STUFF(@str,1,SUM(Length)+3,' ' ) FROM (SELECT 
                                Rn = ROW_NUMBER() OVER(ORDER BY (SELECT 1)) 
                                ,N.value('.', 'varchar(10)') as value
                                ,Length = LEN(N.value('.', 'varchar(10)'))  
                            FROM @xml.nodes('X') as T(N))X 
        WHERE X.Rn<=3

Result

 sometext,foo,bar
Niladri Biswas
  • 4,153
  • 2
  • 17
  • 24
  • 1
    Order is not guaranteed here, so you could potentially get the split elements of the string back in a different order when the post-stuff contains commas. – Aaron Bertrand Sep 12 '12 at 14:20
  • I am generating it (Row Number) in a sequential way.So how will the order change? – Niladri Biswas Sep 12 '12 at 15:01
  • 1
    It might not, but that's not the point, it's not guaranteed! How is it sequential? Your inner order by purposely makes the order arbitrary, and the outer query doesn't use order by at all! – Aaron Bertrand Sep 12 '12 at 15:18