0

I need to write a select query that will search a single table for words taken from a user input query string like "John Doe Engineering". The string can consist of a single or multiple words. The query string will be passed into a stored procedure as a parameter. In total there area about 20 columns that need to be searched. My first thought was something like this:

SELECT *
FROM Employees
WHERE FirstName LIKE '%John%' OR FirstName LIKE '%Doe%' OR FirstName LIKE '%Engineering%'
WHERE LastName LIKE '%John%' OR LastName LIKE '%Doe2%' OR LastName LIKE '%Engineering%'
WHERE Manager LIKE '%John%' OR Manager LIKE '%Doe%'OR Manager LIKE '%Engineering%'
WHERE Department LIKE '%John%' OR Department LIKE '%Doe%'OR Department LIKE '%Engineering%'
--repeat for 16 more table columns

But I'm not sure how to best generate the query syntax based upon the user query string input. Furthermore this seems like it would be a highly inefficient query. Would it be better to look at using full text search in this case? I'm wondering what the best approach might be?

PixelPaul
  • 2,609
  • 4
  • 39
  • 70

1 Answers1

0

As everyone else said - Full text search is probably the best solution for this type of thing. That said, I thought it would be fun to offer a T-SQL solution.

Quick Disclaimer 1

*I would strongly encourage you not to use the solutions below - this was intended to be a fun little SQL exercise; the performance would be bad. Also - I demonstrate two very efficient ways to split a string: one using Jeff Moden's DelimitedSplit8K, the other technique using PARSENAME *

Quick Disclaimer 2

I should point out a problem with concatenating the columns into a single string as a couple people suggested -- it can lead to false positives; consider the following query:

DECLARE @search varchar(100) = 'ab';

WITH sampleData AS (SELECT fn, ln FROM (VALUES ('aa', 'bb'), ('cc', 'dd')) t(fn,ln))
SELECT * 
FROM sampleData
WHERE CONCAT(fn,ln) LIKE '%'+@search+'%';

The above query will return the first record even though the "ab" does not exist in either column. For that reason you would change the WHERE (or CHARINDEX in John's example) to look like this:

WHERE CONCAT(fn, '|||', ln) LIKE '%'+@search+'%';

My Solutions

-- SAMPLE DATA 
-------------------------------------------------
DECLARE @employees TABLE 
(
  FirstName  varchar(100),
  LastName   varchar(100),
  Manager    varchar(100),
  Department varchar(100)
);

INSERT @employees
SELECT * 
FROM 
(
  VALUES 
    ('bob', '****', 'ddd', 'sss'),
    ('fff', 'fred', 'obx', 'ccc'),
    ('Sue', 'abcd', 'ddd', 'zzz'),
    ('ddd', 'dcba', '123', 'fobbb')
) xx(x1, x2, x3, x4);

-- Solution #1: when @search has <= 4 "items"
-------------------------------------------------
DECLARE @search varchar(100) = 'xx bb ff zz';

SELECT e.*
  --,PARSENAME(REPLACE(@search,' ','.'), N) AS matchedPattern
FROM (VALUES (1),(2),(3),(4)) t(n)
CROSS JOIN @employees e
WHERE 
  CHARINDEX
  (
    PARSENAME(REPLACE(@search,' ','.'), N), 
    CONCAT(FirstName, '|||', LastName, '|||', Manager, '|||', Department)
  ) > 0;

-- Solution #2: when @search has (or can have) > 4 "items"
-------------------------------------------------

-- for this you will need delimitedsplit8k: http://www.sqlservercentral.com/articles/Tally+Table/72993/
SELECT e.*
FROM dbo.delimitedsplit8k(@search, ' ')
CROSS JOIN @employees e
WHERE 
  CHARINDEX
  (
    item, 
    CONCAT(FirstName, '|||', LastName, '|||', Manager, '|||', Department)
  ) > 0;
Alan Burstein
  • 7,770
  • 1
  • 15
  • 18
  • the idea of combining the columns is terrible. It will cause slow performance. As i mentioned in my comment above – t-clausen.dk Mar 09 '17 at 22:01
  • @t-clausen.dk - this is why I said " Full text search is probably the best solution" then added that I was including a T-SQL solution "for fun." You offered nothing other than a comment that everyone here, including the OP, already knew. I'll be interested in the solution that you post.... – Alan Burstein Mar 09 '17 at 23:48
  • you are actually down voting my [answer](http://stackoverflow.com/questions/42509024/get-string-between-2-characters-that-repeats-several-times-in-sql-server/42509706#comment72537334_42509706) to another question in revenge of me down voting this ? – t-clausen.dk Mar 10 '17 at 08:30
  • 1
    Not the downvoter, but if you're going to provide a knowingly inefficient solution, you should probably also provide an efficient solution as well. A bad solution won't help anyone that comes across this in the future. – user2366842 Mar 10 '17 at 14:36
  • user2366842 - first sentence - I suggested Full text indexing; I can't think of a faster solution for this. @t-clausen.dk - if you were read my entire post you would agree there is a lot of extremely useful SQL knowledge there. E.g observe how I split a string using parsename and the values constructor - that's MY technique and represents the fastest pre-2016 way to split a string known to man for when there are <= 4 items/tokens; not even a CLR can touch it. Yet you voted my answer down. Are you trying to discourage people with an advanced level of T-SQL knowledge from posting? – Alan Burstein Mar 10 '17 at 14:58