Any ideas on how to improve this query performance?
[ftsIndex]
The PK is sID, wordPos.
And there is an index on wordID, sID, wordPos.
They are all int.
In the end use a distinct.
Most sID just have a few matches.
Some sID may have over 10,000 matches and kill the query.
Have a query where the first 27,749 rows return in 11 seconds.
No single sID has more than 500 matches.
The sum of the individual matches is 65,615.
The 27,750th row alone takes over 2 minutes and has 15,000 matches.
Not a surprise as the join at the end is on [sID].
Since in the end use distinct is there a way to it to look for the first affirmative
on [wXright].[sID] = [wXleft].[sID]
and [wXright].[wordPos] > [wXleft].[wordPos]
and [wXright].[wordPos] <= [wXleft].[wordPos] + 10
then move to the next sID?
I know this is asking a lot from the query optimizer but that would really be cool.
In real life the problem document is a part lists and the supplier is repeated many times.
select distinct [wXleft].[sID]
FROM
( -- begin [wXleft]
( -- start term
select [ftsIndex].[sID], [ftsIndex].[wordPos]
from [ftsIndex] with (nolock)
where [ftsIndex].[wordID] in
(select [id] from [FTSwordDef] with (nolock)
where [word] like 'Brown')
) -- end term
) [wXleft]
join
( -- begin [wRight]
( -- start term
select [ftsIndex].[sID], [ftsIndex].[wordPos]
from [ftsIndex] with (nolock)
where [ftsIndex].[wordID] in
(select [id] from [FTSwordDef] with (nolock)
where [word] like 'Fox')
) -- end term
) [wXright]
on [wXright].[sID] = [wXleft].[sID]
and [wXright].[wordPos] > [wXleft].[wordPos]
and [wXright].[wordPos] <= [wXleft].[wordPos] + 10
This brings it down to 1:40
inner loop join
I did it just to try and it totally changed up the query plan.
I don't know how long the problem query takes. I gave up at 20:00.
I am not even going to to post this as an answer as I don't see that it would be of value to anyone else.
Hoping for a better answer.
If I don't get one in the next two days I will just delete the question.
This does not fix it
select distinct [ft1].[sID]
from [ftsIndex] as [ft1] with (nolock)
join [ftsIndex] as [ft2] with (nolock)
on [ft2].[sID] = [ft1].[sID]
and [ft1].[wordID] in (select [id] from [FTSwordDef] with (nolock) where [word] like 'brown')
and [ft2].[wordID] in (select [id] from [FTSwordDef] with (nolock) where [word] like 'fox')
and [ft2].[wordPos] > [ft1].[wordPos]
and [ft2].[wordPos] <= [ft1].[wordPos] + 10
Also support queries like "quick brown" with 10 words of "fox" or "coyote" so joins with aliases is not a good path.
This takes 14 minutes (but at least it runs).
Again this format is not conducive to more advanced queries.
IF OBJECT_ID(N'tempdb..#tempMatch1', N'U') IS NOT NULL DROP TABLE #tempMatch1
CREATE TABLE #tempMatch1(
[sID] [int] NOT NULL,
[wordPos] [int] NOT NULL,
CONSTRAINT [PK1] PRIMARY KEY CLUSTERED
(
[sID] ASC,
[wordPos] ASC
))
IF OBJECT_ID(N'tempdb..#tempMatch2', N'U') IS NOT NULL DROP TABLE #tempMatch2
CREATE TABLE #tempMatch2(
[sID] [int] NOT NULL,
[wordPos] [int] NOT NULL,
CONSTRAINT [PK2] PRIMARY KEY CLUSTERED
(
[sID] ASC,
[wordPos] ASC
))
insert into #tempMatch1
select [ftsIndex].[sID], [ftsIndex].[wordPos]
from [ftsIndex] with (nolock)
where [ftsIndex].[wordID] in
(select [id] from [FTSwordDef] with (nolock)
where [word] like 'Brown')
--and [wordPos] < 100000;
order by [ftsIndex].[sID], [ftsIndex].[wordPos]
insert into #tempMatch2
select [ftsIndex].[sID], [ftsIndex].[wordPos]
from [ftsIndex] with (nolock)
where [ftsIndex].[wordID] in
(select [id] from [FTSwordDef] with (nolock)
where [word] like 'Fox')
--and [wordPos] < 100000;
order by [ftsIndex].[sID], [ftsIndex].[wordPos]
select count(distinct(#tempMatch1.[sID]))
from #tempMatch1
join #tempMatch2
on #tempMatch2.[sID] = #tempMatch1.[sID]
and #tempMatch2.[wordPos] > #tempMatch1.[wordPos]
and #tempMatch2.[wordPos] <= #tempMatch1.[wordPos] + 10
A slightly different join runs in 5 seconds (and has a different query plan).
But I cannot fix it with hints as it moves where it does one join.
And even the +1 has over 10 documents that have over 7,000 matches.
on [wXright].[sID] = [wXleft].[sID]
and [wXright].[wordPos] = [wXleft].[wordPos] + 1
Full table def
CREATE TABLE [dbo].[FTSindex](
[sID] [int] NOT NULL,
[wordPos] [int] NOT NULL,
[wordID] [int] NOT NULL,
[charPos] [int] NOT NULL,
CONSTRAINT [PK_FTSindex] PRIMARY KEY CLUSTERED
(
[sID] ASC,
[wordPos] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[FTSindex] WITH CHECK ADD CONSTRAINT [FK_FTSindex_FTSwordDef] FOREIGN KEY([wordID])
REFERENCES [dbo].[FTSwordDef] ([ID])
GO
ALTER TABLE [dbo].[FTSindex] CHECK CONSTRAINT [FK_FTSindex_FTSwordDef]
GO