0

Let's say that I have a table with data like this:

keyword          times_phrase_searched
-------          ---------------------
open windows     1000
closed windows   750
open doors       350
closed doors     250
nice window      100
nice windows     50
ugly doors       25

What SQL query would I need in order to group each word separately and sum the search volumes of the phrases that word is present in. So for the example data, the expected result would be:

word             times_word_in_searches
----             ----------------------
windows          1800
open             1350
closed           1000
doors            625
nice             150
window           100
ugly             25
dodecafonico
  • 195
  • 1
  • 10

3 Answers3

1

If you don't have a numbers table, create one (see how here). Then you can use this query to get word counts:

select SUBSTRING_INDEX(SUBSTRING_INDEX(words.keyword, ' ', numbers.n), ' ', -1) word, 
       SUM(words.times_phrase_searched) times_word_in_searches
from numbers 
inner join words on CHAR_LENGTH(words.keyword) - CHAR_LENGTH(REPLACE(words.keyword, ' ', '')) >= numbers.n - 1
group by word
order by sum(words.times_phrase_searched) desc;

This splits your keyword column, so that you get each individual keyword on its own row. Then it's a simple matter to group and sum.

DEMO

Community
  • 1
  • 1
Jerrad
  • 5,240
  • 1
  • 18
  • 23
1

Try this...

DECLARE @temp TABLE 
  ( 
     keyword VARCHAR(max), 
     times   INT 
  ) 

INSERT INTO @temp 
VALUES      ('open windows', 
             1000 ), 
            ('closed windows', 
             750 ), 
            ('open doors', 
             350 ), 
            ('closed doors', 
             250 ), 
            ('nice window', 
             100 ), 
            ('nice windows', 
             50 ), 
            ('ugly doors', 
             25 ) 

DECLARE @allValues VARCHAR(max) = (SELECT Stuff((SELECT 
                                          ',' + Replace(p2.keyword, ' ', 
                                          ',') 
                 FROM   @temp p2 
                 ORDER  BY p2.keyword 
                 FOR xml path(''), type).value('.', 'varchar(max)'), 1, 1, '')) 


-- find distinct words 
SELECT DISTINCT t.element, 
                (SELECT Sum(k.times) 
                 FROM   @temp k 
                 WHERE  k.keyword LIKE '%' + t.element + '%') 
FROM   dbo.Func_split(@allValues, ',') t 

The function Func_split (credit: https://stackoverflow.com/a/21428746/710925)

CREATE FUNCTION [dbo].[func_Split] 
    (   
    @DelimitedString    varchar(8000),
    @Delimiter              varchar(100) 
    )
RETURNS @tblArray TABLE
    (
    ElementID   int IDENTITY(1,1),  -- Array index
    Element     varchar(1000)               -- Array element contents
    )
AS
BEGIN

    -- Local Variable Declarations
    -- ---------------------------
    DECLARE @Index      smallint,
                    @Start      smallint,
                    @DelSize    smallint

    SET @DelSize = LEN(@Delimiter)

    -- Loop through source string and add elements to destination table array
    -- ----------------------------------------------------------------------
    WHILE LEN(@DelimitedString) > 0
    BEGIN

        SET @Index = CHARINDEX(@Delimiter, @DelimitedString)

        IF @Index = 0
            BEGIN

                INSERT INTO
                    @tblArray 
                    (Element)
                VALUES
                    (LTRIM(RTRIM(@DelimitedString)))

                BREAK
            END
        ELSE
            BEGIN

                INSERT INTO
                    @tblArray 
                    (Element)
                VALUES
                    (LTRIM(RTRIM(SUBSTRING(@DelimitedString, 1,@Index - 1))))

                SET @Start = @Index + @DelSize
                SET @DelimitedString = SUBSTRING(@DelimitedString, @Start , LEN(@DelimitedString) - @Start + 1)

            END
    END

    RETURN
END
Community
  • 1
  • 1
Razort4x
  • 3,296
  • 10
  • 50
  • 88
0

;

 with tmp(keyword, times_phrase_searched) as(
select times_phrase_searched, LEFT((keyword, CHARINDEX(' ',(keyword+' ')-1),
 STUFF(Data, 1, CHARINDEX(' ',keyword+' '), '')
from TestData
Union ALL
select times_phrase_searched, LEFT((keyword, CHARINDEX(' ',(keyword+' ')-1),
 STUFF(Data, 1, CHARINDEX(' ',keyword+' '), '')
from Test
Where Data >'')
SELECT
   keyword,
   SUM(times_phrase_searched) AS times_phrase_searched   
FROM tmp
GROUP BY keyword;

I am not sure but please check this...

Jim Macaulay
  • 4,709
  • 4
  • 28
  • 53
ROOMY
  • 53
  • 8