1

I have a column named "Message". In this column there is a data which is HTML code. I need to parse this HTML in SQL then split it 5 different column "Name" - "Surname" - "Email" - "Telephone" - "Message". Here is the HTML format that I need to parse;

<html>
   <body>
      <br><br>
      <table>
         <tr>
            <td>NameSurname</td>
            <td>kaydi peldi sord</td>
         </tr>
         <tr>
            <td>Email</td>
            <td>...@gmail.com</td>
         </tr>
         <tr>
            <td>Telephone</td>
            <td>535...5464</td>
         </tr>
         <tr>
            <td colspan=2>Message</td>
         </tr>
         <tr>
            <td colspan=2>Benfica-Fenerbahçe</td>
         </tr>
      </table>
   </body>
</html>

First, split NameSurname to Name and Surname. The rule is split from last space (in this sample, it should "Name : ejder mehmet" , "Surname : sıkık", then insert other columns directly. How can I do that? Thanks for answers!

Jungleman
  • 286
  • 2
  • 6
  • 20
  • This isnt really a "request to your hearts desire website", its to help users who are actually willing to learn. http://stackoverflow.com/questions/8816194/how-to-parse-html-table-using-php – jagmitg Feb 04 '15 at 09:43
  • As JSG suggests, just don't. Either get it out and parse it in something designed to parse HTML, such as PHP, or, better, split it up and store the various parts when you insert the Message into the database. – Rhumborl Feb 04 '15 at 09:47
  • It is not for website. It is SSIS package. We retrieve data from other company. So I have to do it in SQL. – Jungleman Feb 04 '15 at 09:47
  • Create a code step in your SSIS package and do it in VB/C# – Rhumborl Feb 04 '15 at 09:47
  • Yeah have something else (like PHP) take the data you get, explode it into the proper sections and as it does so, have it write individual queries into a text file, line by line. Then, once that's done, run the queries in your database and have it get updated. Let's hope your raw data has some sort of delimiter between fields. –  Feb 04 '15 at 09:53
  • Convert it to XML datatype and you can query it – Allan S. Hansen Feb 04 '15 at 13:55

1 Answers1

1

I'm a year late, it's not pretty, and it's definitely not 100% safe, but this does the job for me on the rare occasions I need to parse HTML. Create this function first.

CREATE FUNCTION dbo.StringBetweenTwoPatterns (@PrePattern varchar(max) @PostPattern varchar(max), @string varchar(max)) 
RETURNS varchar(Max)
AS 
BEGIN   
DECLARE @WildPre VARCHAR(MAX) = '%' + @PrePattern + '%'
DECLARE @WildPost VARCHAR(MAX) = '%' + @PostPattern + '%'
IF PATINDEX(@WildPre, @String) > 0
    AND PATINDEX(@WildPost, @String) > 0
BEGIN
    DECLARE @RIGHT VARCHAR(MAX) = SUBSTRING(@string, PATINDEX(@WildPre,@string) + LEN(@PrePattern), LEN(@string))
    RETURN LEFT(@RIGHT,(PATINDEX(@WildPost,@RIGHT) - 1))
END
RETURN NULL
END
GO

When you call this function, you have to keep full formatting and white space in the search strings, so it's going to look like this:

SELECT [NameSurname] = StringBetweenTwoPatterns('<td>NameSurname</td>
        <td>','</td>',[Message]

Splitting Name and Surname is something you should be able to extrapolate from the the substring, right, left, and patindex examples above. Or just google some other answers for that.

Steve Eggering
  • 759
  • 2
  • 9
  • 23