1

Alright so I have two tables. Table1 has an reference number column (A), and a second column that has a string of randomness (B). Table2 has only one column with a list of values that may or may not be in the string on Table1.

dbo.Tbl_1

+--+---------------------------------------+
|A |B                                      |
+--+---------------------------------------+
|24|BLUE; KITTEN; WHITE; PINK; SLOPE; GREEN|
+--+---------------------------------------+
|51|GREEN; CLOUDY; WHITE; CHIPS            |
+--+---------------------------------------+
|78|PATRIOTS; PINK; PINK; WHITE            |
+--+---------------------------------------+
|22|WHITE; RED; TREES; AMY; GREEN          |
+--+---------------------------------------+

dbo.Tbl_2

+-----+
|C    |
+-----+
|BLUE |
+-----+
|WHITE|
+-----+
|PINK |
+-----+
|BROWN|
+-----+

What sql query would determine how many times a value from Table2 is found in the string on Table1? Basically I want to return the below result set:

+-----+----+
|BLUE |1   |
+-----+----+
|WHITE|4   |
+-----+----+
|PINK |3   |
+-----+----+
|BROWN|NULL|
+-----+----+

FYI: In reality, Table2 has about 200 unique records. Table1 has about 1.6M records with unique reference numbers. Neither tables are static.

DoNotArrestMe
  • 1,285
  • 1
  • 9
  • 20
Bethany_93
  • 13
  • 4
  • is there always going to be only one instance of a color in tbl 1 col B? i.e. in tbl 1 id 24 could never have BLUE twice on the same record – Johnny Jan 19 '14 at 00:30
  • 1
    Rather than asking "How I do X", you should try out some queries on your own, and tell us what you've tried and what specifically isn't working. – Tim Dean Jan 19 '14 at 00:36
  • @Bethany_93 your table `dbo.Tbl_1` not in normal form. Why? – DoNotArrestMe Jan 20 '14 at 08:54

2 Answers2

0

I played around a bit and came up with this SQL fiddle

The relevant SELECT query looks like this (requires two table scans though, I'm sure it can be made more efficient):

select C, sum(dbo.CountOccurancesOfString(B, C)) as number
from Tbl_1 join Tbl_2 on 1=1
group by C
order by number desc

EDIT This is the function I got from this answer:

CREATE FUNCTION dbo.CountOccurancesOfString
(
 @searchString nvarchar(max),
 @searchTerm nvarchar(max)
)
RETURNS INT
AS
BEGIN
 return (LEN(@searchString)-LEN(REPLACE(@searchString,@searchTerm,'')))/LEN(@searchTerm)
END
Community
  • 1
  • 1
Sindri Guðmundsson
  • 1,253
  • 10
  • 24
  • This solution only works as long as all values in Tbl_2 are **not** substrings of other items in the table. For example, **GREEN vs. LIGHTGREEN** (`GREEN` would be double counted in this case). – Wagner DosAnjos Jan 19 '14 at 02:36
  • wdosanjos - you are of course right. I suppose the split approach by M.Ali is better as the countoccurrences function would have to be drastically changed. – Sindri Guðmundsson Jan 19 '14 at 12:04
0

Split Function

CREATE FUNCTION [dbo].[udf_Split]
(
    @RowData nvarchar(2000),
    @SplitOn nvarchar(5)
)  
RETURNS @RtnValue table 
(
    Id int identity(1,1),
    Data nvarchar(100)
) 
AS  
BEGIN 
    Declare @Cnt int
    Set @Cnt = 1

    While (Charindex(@SplitOn,@RowData)>0)
    Begin
        Insert Into @RtnValue (data)
        Select 
            Data = ltrim(rtrim(Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)))

        Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+1,len(@RowData))
        Set @Cnt = @Cnt + 1
    End

    Insert Into @RtnValue (data)
    Select Data = ltrim(rtrim(@RowData))

    Return
END

Your Data

DECLARE @Table_1 TABLE ([A] INT,[B] VARCHAR(1000)) 
INSERT INTO @Table_1 VALUES
(24,'BLUE; KITTEN; WHITE; PINK; SLOPE; GREEN'),
(51,'GREEN; CLOUDY; WHITE; CHIPS'), 
(78,'PATRIOTS; PINK; PINK; WHITE'), 
(22,'WHITE; RED; TREES; AMY; GREEN')

DECLARE @Table_2 TABLE (ColumnName VARCHAR(100))
INSERT INTO @Table_2 VALUES 
('BLUE'),('WHITE'),('PINK'),('BROWN')

Query

SELECT T2.ColumnName, TotalNums
 FROM 
(SELECT Data, COUNT(DATA) TotalNums 
FROM  @Table_1 t  CROSS APPLY (SELECT * FROM [dbo].[udf_Split](t.B, ';'))C
GROUP BY Data) T1
RIGHT JOIN @Table_2 T2 
ON T1.Data = T2.ColumnName

Result Set

╔════════════╦═══════════╗
║ ColumnName ║ TotalNums ║
╠════════════╬═══════════╣
║ BLUE       ║ 1         ║
║ WHITE      ║ 4         ║
║ PINK       ║ 3         ║
║ BROWN      ║ NULL      ║
╚════════════╩═══════════╝
M.Ali
  • 67,945
  • 13
  • 101
  • 127