5

I have the requirement to search several different tables in my SQL Server database. And I need to sort the results based on in which table the match occurred.

The approach I've taken is shown below. However, this doesn't seem very efficient as the amount of data grows.

Can anyone suggests any tricks to optimize this?

-- Full-text query
DECLARE @FtsQuery nvarchar(100)
SET @FtsQuery = 'FORMSOF(INFLECTIONAL, detail)'

-- Maximum characters in description column
DECLARE @MaxDescription int
SET @MaxDescription = 250

SELECT 1 AS RankGroup, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) AS Description FROM Table1
    INNER JOIN CONTAINSTABLE(Table1, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table1.Id
UNION SELECT 2, FTS.Rank, Id, Title, NULL FROM Table2
    INNER JOIN CONTAINSTABLE(Table2, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table2.Id
UNION SELECT 3, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table3
    INNER JOIN CONTAINSTABLE(Table3, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table3.Id
UNION SELECT 4, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table4
    INNER JOIN CONTAINSTABLE(Table4, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table4.Id
UNION SELECT 5, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table5
    INNER JOIN CONTAINSTABLE(Table5, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table5.Id
ORDER BY RankGroup, Rank DESC

One idea I'd considered is to create an indexed view and then perform the search on the view. But since the view would need these UNIONs, it's hard to see how that would be any more efficient.

Jonathan Wood
  • 65,341
  • 71
  • 269
  • 466
  • One simple optimization you can do safely is replace `UNION` with `UNION ALL`. For more info: http://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all – niaher Nov 13 '12 at 09:56

1 Answers1

7

This is a difficult issue, because CONTAINSTABLE can only search a single table's FTS index at a time. Your UNION solution above is fine as long as your performance is acceptable.

We faced the same issue of needing to efficiently search many columns from many tables in a single query. What we did was aggregate all of the data from these columns and tables into a single read-only table. Our query then only needed a single CONTAINSTABLE call

 CONTAINSTABLE(AggregatedTable, AggregatedColumn, @FtsQuery)

We have a scheduled job that runs every 5-10 minutes and incrementally aggregates any modified content from our source table into our single read-only aggregated content table.

In general it seems that using FTS in any reasonably-sized database and user load means you are always battling with performance. If you find that no matter what you do you cannot get the performance to be acceptable, you may need to investigate other technologies such as Lucene.

Joe Alfano
  • 10,149
  • 6
  • 29
  • 40
  • Interesting. What type of approach did you take to grab the altered data and push it to the aggregate table? And how long does that scheduled job normally take to run? – mg1075 Apr 12 '13 at 03:40
  • 1
    We use a stored proc that is scheduled by the sql agent to update the denormalized table. Our base data has a datetime stamp on each row indicating when the row was last touched (inserted or updated) so we use that to just do an incremental population every 5 or 10 minutes. Our initial load took an hour or so, but the incremental populations take only 10 or 20 seconds. – Joe Alfano Apr 12 '13 at 23:47
  • Thanks for the input. I suppose the only issues of concern there are: (1) the time it takes to query the tables to determine the records qualifying for an update, and (2) if there was any downtime and an update that normally would have run - and looked at a records only so far back in time - failed to run. – mg1075 Apr 13 '13 at 00:03