1

I have a table containing a column of raw email text including headers and message body. This may or may not include a url from my domain. If it does contain a url from my domain, I'd like to add a row to the result and increment the number of occurrences of that URL.

For example, the result should look like this:

Link    Count 
----    -----
a       19
b       5
c       1

What is a sensible approach to this?

user57976
  • 11
  • 2
  • You can use the count function: SELECT COUNT(column_name) FROM table_name; – user986959 Feb 27 '14 at 15:36
  • Are you using Full-Text indexes? Maybe use PATINDEX? http://technet.microsoft.com/en-us/library/ms188395.aspx – NickyvV Feb 27 '14 at 15:40
  • I found [this](http://stackoverflow.com/questions/3031526/sql-function-to-get-count-of-how-many-times-string-appears-in-column) for mysql which is pretty short. I don't know exactly what I think of it (or its performance) but it is pretty clever. – user3334690 Feb 27 '14 at 15:50

1 Answers1

0

I found a function on SQLMag (by Brian Moran) that may be of use: Please note I haven't tested the function, so you might wanna do that yourself :)

CREATE function WordRepeatedNumTimes
(@SourceString varchar(8000),@TargetWord varchar(8000))
RETURNS int
AS
BEGIN
DECLARE @NumTimesRepeated int
    ,@CurrentStringPosition int
    ,@LengthOfString int
    ,@PatternStartsAtPosition int
    ,@LengthOfTargetWord int
    ,@NewSourceString varchar(8000)

SET @LengthOfTargetWord = len(@TargetWord)
SET @LengthOfString = len(@SourceString)
SET @NumTimesRepeated = 0
SET @CurrentStringPosition = 0
SET @PatternStartsAtPosition = 0
SET @NewSourceString = @SourceString

WHILE len(@NewSourceString) >= @LengthOfTargetWord
BEGIN

    SET @PatternStartsAtPosition = CHARINDEX 
(@TargetWord,@NewSourceString)

    IF @PatternStartsAtPosition <> 0
    BEGIN
        SET @NumTimesRepeated = @NumTimesRepeated + 1

        SET @CurrentStringPosition = @CurrentStringPosition + @PatternStartsAtPosition + 
@LengthOfTargetWord

        SET @NewSourceString = substring(@NewSourceString,
 @PatternStartsAtPosition + 
@LengthOfTargetWord, @LengthOfString)

    END
    ELSE
    BEGIN
        SET @NewSourceString = ''
    END

END

RETURN @NumTimesRepeated
END

You can then use it in the following way:

DECLARE @link varchar(max)='http://YourLinkHere.com'
SELECT SUM(dbo.WordRepeatedNumTimes(field, @link))
FROM Table

The original article can be found here

NickyvV
  • 1,720
  • 2
  • 16
  • 18