7

Ok, using SQL Server 2008. On my web page I have a textbox with jQuery-UI AutoComplete hooked up.

Now I need a stored procedure to search across all columns of a single table (or multiple joined tables, I suppose) for a search string coming from the textbox/autocomplete AJAX call, and return "suggested" search strings. I am using the AdventureWorks db for testing (Products table).

So, for example, the product table has columns for product name and product number (among others), and I want to return suggested search strings based on user input, where they may enter a product name and/or a product number.

I have it working across a single column which was simple. Any ideas?

abatishchev
stephen776
  • Thanks everyone for all the great feedback and examples. They all have their uses in various situations and I can see myself using some form of all of them in the future – stephen776 Dec 23 '10 at 14:17

4 Answers

6

I'm going to suggest full-text search (Microsoft's or Lucene will work). The code below uses MSSQL FTS, as it's what I use in my app at the moment.

Install Full-Text Search if you haven't already; if you have, check that the service is running. In Management Studio, run this to set up a catalog and add the Product table's Color, Name, and ProductNumber columns to it.

USE [AdventureWorks]
GO
CREATE FULLTEXT CATALOG [ProductsTest] WITH ACCENT_SENSITIVITY = OFF
AUTHORIZATION [dbo]

GO

USE [AdventureWorks]
GO
CREATE FULLTEXT INDEX ON [Production].[Product] KEY INDEX [PK_Product_ProductID] ON ([ProductsTest]) WITH (CHANGE_TRACKING AUTO)
GO
USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ADD ([Color])
GO
USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ADD ([Name])
GO
USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ADD ([ProductNumber])
GO
USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ENABLE
GO

You can then run queries against all columns at once, e.g. Silver (chosen as it's in both Color and Name):

Select * from production.product where
contains(*, '"Silver*"')

The trailing * in the search term makes it a prefix match on Silver, so you can use this to build up results as the user types. One thing to consider is that Google makes this work in real time; if you are searching a lot of data, you need to be able to get results back without interrupting the user's typing. I think people generally use these searches by typing from the first letter of what they are looking for. I accept there will be spelling mistakes; you could run a spell checker after every space the user types to handle that, or store the searches that are run, look at the misspellings, and change the code to handle them based on a mapping (or, in FTS, a custom thesaurus).
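To tie the prefix search back to the autocomplete call, here is a minimal sketch of a stored procedure that wraps it. The procedure name, parameter names, and default row limit are assumptions for illustration; it presumes the full-text index created above is in place.

```sql
-- Hypothetical procedure name and parameters; not part of AdventureWorks.
CREATE PROCEDURE dbo.usp_SuggestProducts
    @SearchText nvarchar(100),
    @MaxRows    int = 10
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Clean nvarchar(100), @Term nvarchar(110);

    -- Remove double quotes so the input cannot break out of the quoted term,
    -- then trim surrounding spaces.
    SET @Clean = LTRIM(RTRIM(REPLACE(ISNULL(@SearchText, N''), N'"', N'')));

    -- CONTAINS raises an error on an empty search condition, so bail out early.
    IF @Clean = N'' RETURN;

    -- Build a prefix term, e.g. Silver becomes "Silver*".
    SET @Term = N'"' + @Clean + N'*"';

    SELECT TOP (@MaxRows) ProductID, Name, ProductNumber, Color
    FROM Production.Product
    WHERE CONTAINS((Name, ProductNumber, Color), @Term)
    ORDER BY Name;
END
GO

-- Example call, as the AJAX handler might issue it.
EXEC dbo.usp_SuggestProducts @SearchText = N'Silver';
```

Passing the term as a variable keeps user input out of the SQL text itself, which is one reasonable way to avoid string-built queries here.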

Ranking is going to be a fun development issue for any business: do you want the first text match for Mountain Frame, or do you want to weight results by sales or price? If the user types in more than one term, you can use FTS to produce a ranking based on the search string.

select aa.rank, bb.* 
From containstable(production.product, *, '"Mountain" and "Silver*"') aa
inner join production.product bb
on aa.[key] = bb.productid
order by rank desc

This returns 30 rows, weighted by the text the user entered to determine the first-place record. In either case you will likely want to add a coded ranking to tweak the results to suit your business needs; ranking the highest-priced widget first might not be the way. That is why you should store what people searched for and clicked on, so you can analyse the results later.
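As a rough illustration of a coded ranking, the sketch below keeps the full-text rank as the primary sort and adds a business tie-breaker. Using ListPrice as that tie-breaker, and limiting to ten rows, are assumptions made purely for the example; substitute whatever measure suits your business.

```sql
SELECT TOP (10)
       ft.[RANK] AS TextRank,
       p.ProductID,
       p.Name,
       p.ProductNumber,
       p.ListPrice
FROM CONTAINSTABLE(Production.Product, *, '"Mountain" AND "Silver*"') AS ft
INNER JOIN Production.Product AS p
        ON p.ProductID = ft.[KEY]
ORDER BY ft.[RANK] DESC,   -- text relevance first
         p.ListPrice ASC,  -- hypothetical business tie-breaker
         p.Name;           -- stable order for remaining ties
```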

There is a really nice language parser for .NET that translates a Google-style query string into FTS-compatible syntax, which gives familiarity to anyone running boolean searches on your site.

You may also want to add some wisdom-of-crowds features by auditing what users typed against what they ultimately went on to visit, and use success maps to alter the final suggestions so they are actually relevant to the user.
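One possible shape for that audit trail is sketched below. The table, its columns, and the sample prefix are all hypothetical; the idea is only to record each search alongside what the user eventually picked, then count picks per typed prefix.

```sql
-- Hypothetical audit table; names and types are illustrative only.
CREATE TABLE dbo.SearchLog (
    SearchLogID      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SearchText       nvarchar(100) NOT NULL,
    ClickedProductID int NULL,          -- NULL when the user picked nothing
    SearchedAt       datetime NOT NULL DEFAULT (GETDATE())
);
GO

-- Products most often chosen after searches starting with a given prefix.
DECLARE @Prefix nvarchar(100);
SET @Prefix = N'mount';

SELECT TOP (10) ClickedProductID, COUNT(*) AS Clicks
FROM dbo.SearchLog
WHERE SearchText LIKE @Prefix + N'%'
  AND ClickedProductID IS NOT NULL
GROUP BY ClickedProductID
ORDER BY Clicks DESC;
```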

As a final suggestion, if this is a commercial website you might want to look at EasyAsk, which is a scary-great natural language processor.

u07ch
3

Using the SOUNDEX function would be the simplest way to match similar "items" in several columns. But a better matching algorithm that is nearly as fast to implement is the Levenshtein edit distance. Here is the T-SQL implementation wrapped in a function. Use it to match similar search terms.
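For readers without the linked implementation to hand, here is a minimal sketch of a Levenshtein function in T-SQL. It is not the linked code. The name dbo.lvn is borrowed from the sample in this answer, and the 255-character limit is an assumption. It keeps only two rows of the distance matrix, each stored as a string of character codes.

```sql
CREATE FUNCTION dbo.lvn (@s nvarchar(255), @t nvarchar(255))
RETURNS int
AS
BEGIN
    IF @s IS NULL OR @t IS NULL RETURN NULL;

    DECLARE @sLen int, @tLen int, @i int, @j int;
    DECLARE @cost int, @del int, @ins int, @sub int, @d int;
    -- Each row holds tLen + 1 distances; a distance d is stored as NCHAR(d + 1).
    DECLARE @prev nvarchar(256), @curr nvarchar(256);

    -- DATALENGTH counts trailing spaces, which LEN would ignore.
    SET @sLen = DATALENGTH(@s) / 2;
    SET @tLen = DATALENGTH(@t) / 2;

    -- Row 0: turning an empty string into the first j characters costs j.
    SET @prev = N'';
    SET @j = 0;
    WHILE @j <= @tLen
    BEGIN
        SET @prev = @prev + NCHAR(@j + 1);
        SET @j = @j + 1;
    END;

    SET @i = 1;
    WHILE @i <= @sLen
    BEGIN
        SET @curr = NCHAR(@i + 1);  -- column 0: delete i characters
        SET @j = 1;
        WHILE @j <= @tLen
        BEGIN
            -- Character comparison follows the database collation.
            SET @cost = CASE WHEN SUBSTRING(@s, @i, 1) = SUBSTRING(@t, @j, 1)
                             THEN 0 ELSE 1 END;
            SET @del = UNICODE(SUBSTRING(@prev, @j + 1, 1));         -- prev[j] + 1
            SET @ins = UNICODE(SUBSTRING(@curr, @j, 1));             -- curr[j-1] + 1
            SET @sub = UNICODE(SUBSTRING(@prev, @j, 1)) - 1 + @cost; -- prev[j-1] + cost

            SET @d = @del;
            IF @ins < @d SET @d = @ins;
            IF @sub < @d SET @d = @sub;

            SET @curr = @curr + NCHAR(@d + 1);
            SET @j = @j + 1;
        END;
        SET @prev = @curr;
        SET @i = @i + 1;
    END;

    RETURN UNICODE(SUBSTRING(@prev, @tLen + 1, 1)) - 1;
END
GO

-- The textbook distance for this pair is 3.
SELECT dbo.lvn(N'kitten', N'sitting') AS Distance;
```

A scalar function like this runs once per row, so on large tables it is probably worth narrowing the candidates first (for example by first letter or length) before computing distances.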

EDIT: Sample of Levenshtein in action (based on gbn's SQL)

Suppose you named your Levenshtein T-SQL function lvn (just for brevity's sake) and that it takes the two strings to compare, with @myinput holding the search string. Then you could do something like:

SELECT productname FROM foo WHERE productname 
    LIKE '%' + @myinput + '%' OR dbo.lvn(productname, @myinput) < 3
UNION
SELECT productnumber FROM foo WHERE productnumber 
    LIKE '%' + @myinput + '%' OR dbo.lvn(productnumber, @myinput) < 3
UNION
...

ORDER BY 1 -- one-based column index sort for UNION queries

Yep, that simple. By the way, I updated the T-SQL Levenshtein link to something that makes more sense, and it's an accepted SO answer.

Paul Sasik
  • Awesome! I will check this out for sure. This should be exactly what I'm looking for – stephen776 Dec 23 '10 at 14:13
  • Yep, this works great. I can see this being useful in situations where one is searching for something like product numbers, where it would be easy to mistype one or more characters – stephen776 Dec 23 '10 at 14:24
  • It's also good for spell checking and fuzzy lookups. Levenshtein is definitely something you wanna put in your back pocket. It's incredibly portable and useful. – Paul Sasik Dec 23 '10 at 14:29
  • @Paul Sasik - Any recommendation for an "upper limit" to pass to the Levenshtein function? What would be a good balance between accuracy and error/typo tolerance? – stephen776 Dec 23 '10 at 14:44
  • @Paul Sasik - I am also having problems with sorting. I am getting the correct results but can't figure out the best way to sort them in the most relevant order... perhaps I should start a new question for this... – stephen776 Dec 23 '10 at 15:00
  • Well, when you do a UNION query the column names become vague so you actually have to order by index, which is 1-based. So your order by will look like this: ORDER BY 1 – Paul Sasik Dec 23 '10 at 15:15
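On the threshold and sorting questions raised in these comments, one possible approach is to carry the edit distance along as a column and sort on it. The sketch below assumes a two-argument dbo.lvn function exists, reuses the placeholder table foo from the answer, and picks length-based thresholds (1 for short input, 2 otherwise) that are illustrative guesses rather than recommendations.

```sql
DECLARE @myinput nvarchar(100);
SET @myinput = N'chainring';  -- sample input, chosen arbitrarily

SELECT suggestion, MIN(distance) AS distance
FROM (
    SELECT productname AS suggestion,
           dbo.lvn(productname, @myinput) AS distance
    FROM foo
    UNION ALL
    SELECT productnumber,
           dbo.lvn(productnumber, @myinput)
    FROM foo
) AS candidates
WHERE distance <= CASE WHEN LEN(@myinput) < 5 THEN 1 ELSE 2 END
   OR suggestion LIKE N'%' + @myinput + N'%'
GROUP BY suggestion
ORDER BY MIN(distance), suggestion;  -- closest matches first
```

Substring matches on long values will have large distances and so sort last here; if those should come first, an extra sort key on the LIKE test would be needed.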
2

Edit: Use a UNION to join separate queries, where @myinput is the search string:

SELECT productname FROM foo WHERE productname LIKE '%' + @myinput + '%'
UNION
SELECT productnumber FROM foo WHERE productnumber LIKE '%' + @myinput + '%'
UNION
...

There is no automatic way to scan all columns unless you use dynamic SQL.
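For illustration, a dynamic SQL version might look like the sketch below. It builds one SELECT per character column of Production.Product from sys.columns and joins them with UNION. The variable names, the Suggestion alias, and the sample search value are assumptions; the search string is passed as a parameter to sp_executesql, not concatenated into the statement.

```sql
DECLARE @SearchText nvarchar(100);
SET @SearchText = N'silver';  -- sample input

DECLARE @sql nvarchar(max), @pattern nvarchar(102);

-- One SELECT per char/varchar/nchar/nvarchar column, glued with UNION.
-- STUFF removes the leading ' UNION ' (7 characters) from the first piece.
SET @sql = STUFF((
    SELECT N' UNION SELECT CAST(' + QUOTENAME(c.name)
         + N' AS nvarchar(4000)) AS Suggestion FROM Production.Product WHERE '
         + QUOTENAME(c.name) + N' LIKE @pattern'
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'Production.Product')
      AND TYPE_NAME(c.system_type_id) IN (N'char', N'varchar', N'nchar', N'nvarchar')
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 7, N'');

SET @sql = @sql + N' ORDER BY 1';

-- Note: % and _ typed by the user still act as LIKE wildcards here.
SET @pattern = N'%' + @SearchText + N'%';

EXEC sys.sp_executesql @sql, N'@pattern nvarchar(102)', @pattern = @pattern;
```

The leading-wildcard LIKE still scans every column, so this trades convenience for performance.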

gbn
  • But what about the SELECT statement? I want to search across all of the columns in the table/tables and return a single column of matching values – stephen776 Dec 23 '10 at 13:02
  • @Paul Sasik: my first answer wasn't thought through...? – gbn Dec 23 '10 at 13:10
  • I saw your first iteration. It was on the mark or at least on the way. Definitely not downvote-worthy IMO. – Paul Sasik Dec 23 '10 at 13:24
  • This seems to be the simplest method. I have seen mention of the Levenshtein Edit Distance but can't wrap my head around how to use it in this case – stephen776 Dec 23 '10 at 13:29
  • Wasn't me that downvoted... I don't have enough rep to do so lol and I agree... NOT downvote-worthy – stephen776 Dec 23 '10 at 13:30
  • Stephen: I undeleted my post since you showed some interest in my solution. It is a complex algorithm but can be used very simply. See my edit, which I'll base on gbn's SQL... – Paul Sasik Dec 23 '10 at 13:40
  • Downvoted, as LIKE '%A%' OR LIKE '%B%' is a one-way ticket to terrible performance. The UNION solution is better. SQL can read more than one column if you enable full-text search; see below. – u07ch Dec 23 '10 at 13:53
0

I have created a sample stored procedure that returns Google-style search results. You can try this T-SQL. You can also add search criteria for more than one table column.

ALTER PROC [dbo].[USP_GetDoctorLookupList]
(
    @SearchText varchar(50),
    @ItemCount int
)
AS
BEGIN
    -- Turn the input into a prefix pattern, e.g. 'abc' becomes 'abc%'
    SET @SearchText = RTRIM(@SearchText) + '%'

    -- Return at most @ItemCount suggestions
    SELECT TOP (@ItemCount) * 
    FROM
    (
        SELECT
            -- Matches on cdocname sort ahead of matches on cdeano
            CASE 
                WHEN RTRIM(LTRIM(d.cdocname)) LIKE @SearchText THEN 1 
                WHEN RTRIM(LTRIM(d.cdeano)) LIKE @SearchText THEN 2 
            END OrderBy,
            d.docid_PK,
            d.cdocname,
            d.cdeano
        FROM doctor d
        -- Keep rows where either column starts with the search text
        WHERE 
            (d.cdocname LIKE @SearchText
            OR d.cdeano LIKE @SearchText
            )  
    ) Doc ORDER BY OrderBy, cdocname
END
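A call from the autocomplete handler might then look like the following. The search text and row count are sample values chosen only for illustration.

```sql
-- Ask for up to 10 suggestions whose name or number starts with 'smi'.
EXEC dbo.USP_GetDoctorLookupList
     @SearchText = 'smi',
     @ItemCount  = 10;
```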
Mohammad Atiour Islam