1

I have a Post table:

PostID    PostHTML
===
1         '<p>Hello world! <img height=400 src="hello-world.jpg" height=600></p>'
2         'Bowties <img src="bowtie.gif" /> are cool. <img src="smiley.gif" />'
3         '<TABLE CELLSPACING=0 BORDER=0 CELLPADDING=0><TR><TD><B><FONT FACE="Arial"><P ALIGN="CENTER"><IMG SRC="FOO.GIF"></P></FONT></B></TD><TD><B><FONT FACE="Arial"><P ALIGN="CENTER"><IMG SRC="BAR.GIF"></P></FONT></B></TD><TD><B><FONT FACE="Arial"><P ALIGN="CENTER"><IMG SRC="INTERNET-CIRCA-1997.GIF"></P></FONT></B></TD></TR></TABLE>'
4         '<B><FONT FACE="Arial"><P>Did I mention this data is hideous?</P></B><P>&nbsp;</P></FONT>'

I need to select the src attribute of every image in this table. So far I can get the first occurrence in each row:

select substring(
    posthtml,
    charindex('src="', posthtml),
    charindex('"', posthtml, charindex('src="', posthtml) + 5) - charindex('src="', posthtml) + 1
) from post

This results in:

src="hello-world.jpg"
src="bowtie.gif"
SRC="FOO.GIF"

What I want is this:

src="hello-world.jpg"
src="bowtie.gif"
src="smiley.gif"
SRC="FOO.GIF"
SRC="BAR.GIF"
SRC="INTERNET-CIRCA-1997.GIF"

How do I get all occurrences in every row?

Big McLargeHuge
  • 14,841
  • 10
  • 80
  • 108

3 Answers3

1

What you need is the ability to run the "coss apply" functionality in SQL server. Create a function that decomposes the value in to a result table, then join that result table using the 'cross apply' concept in a join condition.

Doing this would result in a function that produces a table output, with the input of your String, it would run a 'cursor' over the string, and then add values to the result table. If you call the function "SearchSRC", it would look somethin glike:

select capply.Sources
from mytable
     cross apply SearchSrc(mytable.PostHTML) as capply

It can be relatively complicated, but the documentation has an example too.

I put together an SQLFiddle demonstrating this solution here (Note that an early version of my code on SQLFiddle produced an infinite loop, and now it has the @cnt variable which can be used to limit the loops. I would suggest a default input parameter for that).

The significant parts are the function:

CREATE FUNCTION SearchSRC(@html AS NVarChar(max))
RETURNS @SRC Table
(
  cnt     int,
  cstart   int,
  cend     int,
  src     NVarChar(250)
)
as
BEGIN

  declare @lcase NVarchar(max),
          @start int,
          @end int,
          @cnt int = 0

  select @lcase = lower(@html)

  select @start = CharIndex('src="', @lcase, 1) + 5
  select @end = charIndex('"', @lcase, @start)

  --insert into @SRC
  --select @cnt, @start, @end, @lcase

  -- use 5 start for start because we add 5 manually
  while @cnt < 10 and @start > 5 and @end > @start
  begin
    select @cnt = @cnt + 1

    insert into @SRC
    select @cnt, @start, @end, SubString(@html, @start, @end - @start)

    select @start = CharIndex('src="', @lcase, @end + 1) + 5
    select @end = CharIndex('"', @lcase, @start)

  end

  return
END

which is called as part of your query:

select id, crapp.src
from Posts
     cross apply SearchSRC(PostHTML) as crapp
rolfl
  • 17,539
  • 7
  • 42
  • 76
0

Try this -

SELECT 'SRC="' + LEFT(RIGHT(PostHTML, LEN(PostHTML) - CHARINDEX('src="', PostHTML) - 4), CHARINDEX('"', RIGHT(PostHTML, LEN(PostHTML) - CHARINDEX('src="', PostHTML) - 4 - 1))) + '"'
FROM post

Output

src="hello-world.jpg"
src="bowtie.gif"
src="smiley.gif"
SRC="FOO.GIF"
SRC="BAR.GIF"
SRC="INTERNET-CIRCA-1997.GIF"
Krishnraj Rana
  • 6,516
  • 2
  • 29
  • 36
  • This only returns the first occurrence in each row. Also, since not every row has images in it (I failed to mention this) I get some odd results like `> – Big McLargeHuge Oct 01 '14 at 17:45
0

here is your answer with using CTE if you're using 2008 and above:

create table #objids_table ( PostId int, PostHTML nvarchar(max) null )

insert all the values one by one

;with T(PostHTML, starts, pos) as ( select PostHTML, 1, charindex('src', PostHTML) from #objids_table union all select PostHTML, cast(pos + 1 as int), charindex('src', PostHTML, pos + 1) from T where pos > 0 ) select substring(PostHTML, starts -1, charindex('"', PostHTML, starts + 4) - (starts - 2)) as img from T where starts <> 1 order by PostHTML, starts

you can take a look at similar answers here: SQL Server - find nth occurrence in a string

Community
  • 1
  • 1
mannyyysh
  • 339
  • 1
  • 6