3

I'm having a number of slow prefetch queries in LLBL. Here's a simplified version of the generated SQL:

SELECT DISTINCT 
    Column1
FROM 
    Table1
WHERE 
Table1.Table2ID IN 
(
    SELECT Table2.Table2ID AS Table2ID 
    FROM 
        Table2  
        INNER JOIN Table1 ON  Table2.Table2ID=Table1.Table2ID
        INNER JOIN 
        (
            SELECT DISTINCT 
                Table1.Table2ID AS Table2ID, 
                MAX(Table1.EffectiveDate) AS EffectiveDate 
            FROM Table1  
            WHERE Table1.EffectiveDate <= '2012-01-03 00:00:00:000'
            GROUP BY Table1.Table2ID
        ) MaxEffective  
        ON  
            MaxEffective.Table2ID = Table1.Table2ID 
            AND MaxEffective.EffectiveDate = Table1.EffectiveDate
)

What I'm finding is that the subquery executes fast and if I replace that subquery with the actual results, the outer query is fast. But together, they are slow.

I have ran the Database Engine Tuning Adviser which helped a bit, but it's still quite slow.

I'm not very skilled in understanding the execution plans, but it appears the vast majority of time is spent doing an index seek on Table1.

I expected this to run faster since it's a non-correlated subquery. Is there something I'm just not seeing?

If it were just straight SQL, I'd rewrite the query and do a join, but I'm pretty much stuck with LLBL. Are there any settings I can use to force it to do a join? Is there a reason SQL Server isn't generating the same execution plan as it does for a join?

Edit for actual query...

SELECT DISTINCT 
    ResidentialComponentValues.ResidentialComponentValueID AS ResidentialComponentValueId, 
    ResidentialComponentValues.ResidentialComponentTypeID AS ResidentialComponentTypeId, 
    ResidentialComponentValues.Value, 
    ResidentialComponentValues.Story, 
    ResidentialComponentValues.LastUpdated, 
    ResidentialComponentValues.LastUpdatedBy, 
    ResidentialComponentValues.ConcurrencyTimestamp, 
    ResidentialComponentValues.EffectiveDate, 
    ResidentialComponentValues.DefaultQuantity 
FROM 
ResidentialComponentValues  
WHERE 
ResidentialComponentValues.ResidentialComponentTypeID IN 
(
    SELECT ResidentialComponentTypes.ResidentialComponentTypeID AS ResidentialComponentTypeId 
    FROM 
        ResidentialComponentTypes  INNER JOIN ResidentialComponentValues  
        ON  ResidentialComponentTypes.ResidentialComponentTypeID=ResidentialComponentValues.ResidentialComponentTypeID
        INNER JOIN 
        (
            SELECT DISTINCT 
                ResidentialComponentValues.ResidentialComponentTypeID AS ResidentialComponentTypeId, 
                MAX(ResidentialComponentValues.EffectiveDate) AS EffectiveDate 
            FROM ResidentialComponentValues  
            WHERE ResidentialComponentValues.EffectiveDate <= '2012-01-03 00:00:00:000'
            GROUP BY ResidentialComponentValues.ResidentialComponentTypeID
        ) LPA_E1  
        ON  
            LPA_E1.ResidentialComponentTypeId = ResidentialComponentValues.ResidentialComponentTypeID 
            AND LPA_E1.EffectiveDate = ResidentialComponentValues.EffectiveDate
)

Edit for create statements:

/****** Object:  Table [dbo].[ResidentialComponentTypes]    Script Date: 01/03/2012 13:49:06 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[ResidentialComponentTypes](
    [ResidentialComponentTypeID] [int] IDENTITY(1,1) NOT NULL,
    [ComponentTypeName] [varchar](255) NOT NULL,
    [LastUpdated] [datetime] NOT NULL,
    [LastUpdatedBy] [varchar](50) NOT NULL,
    [ConcurrencyTimestamp] [timestamp] NOT NULL,
    [Active] [bit] NOT NULL,
 CONSTRAINT [PK_ResidentialComponentTypes] PRIMARY KEY CLUSTERED 
(
    [ResidentialComponentTypeID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
/****** Object:  Table [dbo].[ResidentialComponentValues]    Script Date: 01/03/2012 13:49:06 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[ResidentialComponentValues](
    [ResidentialComponentValueID] [int] IDENTITY(1,1) NOT NULL,
    [ResidentialComponentTypeID] [int] NOT NULL,
    [Value] [decimal](18, 3) NOT NULL,
    [Story] [varchar](255) NOT NULL,
    [LastUpdated] [datetime] NOT NULL,
    [LastUpdatedBy] [varchar](50) NOT NULL,
    [ConcurrencyTimestamp] [timestamp] NOT NULL,
    [EffectiveDate] [datetime] NOT NULL,
    [DefaultQuantity] [int] NOT NULL,
 CONSTRAINT [PK_ResidentialComponentPrices] PRIMARY KEY CLUSTERED 
(
    [ResidentialComponentValueID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
CREATE NONCLUSTERED INDEX [_dta_index_ResidentialComponentValues_71_56543435__K1] ON [dbo].[ResidentialComponentValues] 
(
    [ResidentialComponentValueID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_ResidentialComponentValues_71_56543435__K1_2_3_4_5_6_7_8_9] ON [dbo].[ResidentialComponentValues] 
(
    [ResidentialComponentValueID] ASC
)
INCLUDE ( [ResidentialComponentTypeID],
[Value],
[Story],
[LastUpdated],
[LastUpdatedBy],
[ConcurrencyTimestamp],
[EffectiveDate],
[DefaultQuantity]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_ResidentialComponentValues_71_56543435__K2_K1] ON [dbo].[ResidentialComponentValues] 
(
    [ResidentialComponentTypeID] ASC,
    [ResidentialComponentValueID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_ResidentialComponentValues_71_56543435__K2_K8_K1] ON [dbo].[ResidentialComponentValues] 
(
    [ResidentialComponentTypeID] ASC,
    [EffectiveDate] ASC,
    [ResidentialComponentValueID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_ResidentialComponentValues_71_56543435__K2_K8_K1_3_4_5_6_7_9] ON [dbo].[ResidentialComponentValues] 
(
    [ResidentialComponentTypeID] ASC,
    [EffectiveDate] ASC,
    [ResidentialComponentValueID] ASC
)
INCLUDE ( [Value],
[Story],
[LastUpdated],
[LastUpdatedBy],
[ConcurrencyTimestamp],
[DefaultQuantity]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
/****** Object:  ForeignKey [FK_ResidentialComponentValues_ResidentialComponentTypes]    Script Date: 01/03/2012 13:49:06 ******/
ALTER TABLE [dbo].[ResidentialComponentValues]  WITH CHECK ADD  CONSTRAINT [FK_ResidentialComponentValues_ResidentialComponentTypes] FOREIGN KEY([ResidentialComponentTypeID])
REFERENCES [dbo].[ResidentialComponentTypes] ([ResidentialComponentTypeID])
GO
ALTER TABLE [dbo].[ResidentialComponentValues] CHECK CONSTRAINT [FK_ResidentialComponentValues_ResidentialComponentTypes]
GO

enter image description here

Dan
  • 533
  • 8
  • 29
  • 1
    I know you said this is simplified, but it's possible that you have a duplicate alias between the inner and outer queries. If you do, then the engine might be trying to do this as a correlated subquery which could be much slower. – JNK Jan 03 '12 at 21:04
  • They don't have aliases, but maybe referencing the table name directly is doing the same thing. I'll try adding an alias to Table1 in the subquery and see if that helps. – Dan Jan 03 '12 at 21:09
  • @JNK - This would break all the rules on *scope*, I can't imagine that this is really the issue. – MatBailie Jan 03 '12 at 21:09
  • @Dems - I've had it happen before on a query that aliased an outer table and then referred to the alias in a subquery (where it should be using the table itself, not the outer reference). – JNK Jan 03 '12 at 21:12
  • I added aliases, but it didn't help. – Dan Jan 03 '12 at 21:12
  • @Dan - can you add the actual query? – JNK Jan 03 '12 at 21:13
  • Can you add the create table statements including indexes and foreign keys? Also can you add the execution plan XML? – Dave Barker Jan 03 '12 at 21:44
  • Not sure how to get the execution plan posted. It would push my post over the 30k character limit. – Dan Jan 03 '12 at 21:55
  • 1
    @Dan - How about a picture of the graphical execution plan? That might help. – Tony Jan 03 '12 at 21:58
  • This is strange... The first time I tried aliases, it didn't help, but now it seems to be. I can't seem to figure out what I did wrong the first time though. – Dan Jan 03 '12 at 23:59

2 Answers2

0

The inner subquery does not need the DISTINCT as you already GROUP BY ResidentialComponentTypeID:

    (
        SELECT DISTINCT 
            ResidentialComponentValues.ResidentialComponentTypeID 
              AS ResidentialComponentTypeId, 
            MAX(ResidentialComponentValues.EffectiveDate) 
              AS EffectiveDate 
        FROM ResidentialComponentValues 
        WHERE ResidentialComponentValues.EffectiveDate 
              <= '2012-01-03 00:00:00:000'
        GROUP BY ResidentialComponentValues.ResidentialComponentTypeID
    ) LPA_E1 

Not sure if SQL-Server will recognize this and optimize but you can rewrite to be sure:

    (
        SELECT 
            rcv.ResidentialComponentTypeID 
            MAX(rcv.EffectiveDate) AS EffectiveDate 
        FROM ResidentialComponentValues  AS rcv
        WHERE rcv.EffectiveDate 
              <= '2012-01-03 00:00:00:000'
        GROUP BY rcv.ResidentialComponentTypeID
    ) LPA_E1 

And if I am not wrong, you also need neither the other DISTINCT in the query nor the extra subquery nesting. Check if this rewrite gives the same results:

SELECT 
    v.ResidentialComponentValueID, 
    v.ResidentialComponentTypeID, 
    v.Value, 
    v.Story, 
    v.LastUpdated, 
    v.LastUpdatedBy, 
    v.ConcurrencyTimestamp, 
    v.EffectiveDate, 
    v.DefaultQuantity 
FROM 
        ResidentialComponentTypes  AS t
    INNER JOIN ResidentialComponentValues  AS v
        ON  t.ResidentialComponentTypeID=v.ResidentialComponentTypeID
    INNER JOIN 
        (
            SELECT 
                rcv.ResidentialComponentTypeID 
                MAX(rcv.EffectiveDate) AS EffectiveDate 
            FROM ResidentialComponentValues  AS rcv
            WHERE rcv.EffectiveDate 
                  <= '2012-01-03 00:00:00:000'
            GROUP BY rcv.ResidentialComponentTypeID
        ) LPA_E1 
        ON  
            LPA_E1.ResidentialComponentTypeId = v.ResidentialComponentTypeID 
            AND LPA_E1.EffectiveDate = v.EffectiveDate

You also do not need to join the ResidentialComponentTypes as there is a Foreign Key constraint from the ResidentialComponentValues to it but perhaps you have that join to be used in other reports.


No idea how that would be done in LLBL but if you can remove any of the DISTINCT from the code generated - especially the first one - or the extra nesting (or the extra join), it will probably help the confused optimizer.

ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
  • I think LLBL is adding the distinct, but I'll check to see if I'm adding it somehow. Doesn't look like your rewrite helped. – Dan Jan 03 '12 at 22:17
  • @Dan: Can you check how LLBL sees the relationship between the two tables? Is it `1-to-many` ? – ypercubeᵀᴹ Jan 03 '12 at 23:24
0

It's not clear to me from reading your queries what you're actually trying to achieve. Is your outer query trying to select only the most recently effective ResidentialComponentValues records for each ResidentialComponentType?

The DISTINCT on the inner-most query seems unnecessary and may cause the database some difficulty in optimizing the query. You're only selecting 2 columns, and you are grouping by one and aggregating the other, therefore I'm sure that the results will already be distinct. You're not helping the database execute this query more efficiently by specifying DISTINCT, though perhaps the query optimizer would ignore it.

Similarly, the first INNER JOIN to ResidentialComponentValues on the inner query seems like it is unnecessary.

The ON condition for your second INNER JOIN in your subquery (shown below) confuses me. It seems like this is simply joining your LPA_E1 result with the ResidentialComponentValues table from the first INNER JOIN in your subquery, but I think what you're really trying to do is join it with the ResidentialComponentValues table from the outer query.

ON  
    LPA_E1.ResidentialComponentTypeId = ResidentialComponentValues.ResidentialComponentTypeID 
    AND LPA_E1.EffectiveDate = ResidentialComponentValues.EffectiveDate

My guess is that below is the query that you really want, though I don't think that it produces the same results as your original. This selects only the most recently effective ResidentialComponentValue records for each ResidentialComponentType.

declare @endDate datetime
set @endDate = '2012-01-03 00:00:00:000'

SELECT
    ResidentialComponentValues.ResidentialComponentValueID AS ResidentialComponentValueId, 
    ResidentialComponentValues.ResidentialComponentTypeID AS ResidentialComponentTypeId, 
    ResidentialComponentValues.Value, 
    ResidentialComponentValues.Story, 
    ResidentialComponentValues.LastUpdated, 
    ResidentialComponentValues.LastUpdatedBy, 
    ResidentialComponentValues.ConcurrencyTimestamp, 
    ResidentialComponentValues.EffectiveDate, 
    ResidentialComponentValues.DefaultQuantity 
FROM 
    ResidentialComponentValues  
WHERE
    -- the effective date for this ResidentialComponentValue record has already passed
    ResidentialComponentValues.EffectiveDate <= @endDate
    -- and there does not exist any other ResidentialComponentValue record for the same ResidentialComponentType that is effective more recently
    and not exists (
        select 1
        from ResidentialComponentValues LPA_E1
        where
            LPA_E1.ResidentialComponentTypeID = ResidentialComponentValues.ResidentialComponentTypeID
            and LPA_E1.EffectiveDate <= @endDate
            and LPA_E1.EffectiveDate > ResidentialComponentValues.EffectiveDate
    )

Side Note: My guess is that this query would benefit from a 2-column index on the ResidentialComponentValues table for columns (ResidentialComponentTypeID, EffectiveDate).


Additionally, I think that this query shown below will probably produce the same results as your original, and my guess is that it will execute faster.
SELECT
    ResidentialComponentValues.ResidentialComponentValueID AS ResidentialComponentValueId, 
    ResidentialComponentValues.ResidentialComponentTypeID AS ResidentialComponentTypeId, 
    ResidentialComponentValues.Value, 
    ResidentialComponentValues.Story, 
    ResidentialComponentValues.LastUpdated, 
    ResidentialComponentValues.LastUpdatedBy, 
    ResidentialComponentValues.ConcurrencyTimestamp, 
    ResidentialComponentValues.EffectiveDate, 
    ResidentialComponentValues.DefaultQuantity 
FROM 
    ResidentialComponentValues  
WHERE
    -- show any ResidentialComponentValue records where there is any other currently effective ResidentialComponentValue record for the same ResidentialComponentType
    exists (
        select 1
        from ResidentialComponentValues LPA_E1
        where
            LPA_E1.ResidentialComponentTypeID = ResidentialComponentValues.ResidentialComponentTypeID
            and LPA_E1.EffectiveDate <= @endDate
    )


Given the following test data, the first query returns records 2 and 4. The second query returns records 1, 2, 3, 4, and 5.
insert into ResidentialComponentTypes values (1)
insert into ResidentialComponentTypes values (2)
insert into ResidentialComponentTypes values (3)

insert into ResidentialComponentValues (ResidentialComponentValueID, ResidentialComponentTypeID, Value, Story, LastUpdated, LastUpdatedBy, EffectiveDate, DefaultQuantity)
          select 1, 1, 'One',   'Blah', getdate(), 'Blah', '2012-01-01', 1
union all select 2, 1, 'Two',   'Blah', getdate(), 'Blah', '2012-01-02', 1
union all select 3, 1, 'Three', 'Blah', getdate(), 'Blah', '2012-01-04', 1
union all select 4, 2, 'Four',  'Blah', getdate(), 'Blah', '2012-01-02', 1
union all select 5, 2, 'Five',  'Blah', getdate(), 'Blah', '2012-01-04', 1
union all select 6, 3, 'Six',   'Blah', getdate(), 'Blah', '2012-01-04', 1
Dr. Wily's Apprentice
  • 10,212
  • 1
  • 25
  • 27
  • I'll look through your queries and see if I can determine whether they do the same thing. What I was intending to accomplish is to get all ResidentialComponentValues for each ResidentialComponentType that have the most recent effective date for that ResidentialComponentType. It's been awhile since I worked on this, but I think part of the query was making sure the ResidentialComponentType was also effective as of that date. In other words, I don't want ones created after that date. – Dan Jan 03 '12 at 22:22
  • @Dan - Thanks, that helps clarify things. I *think* that the first SELECT query in my answer provides what you're after. Get the most recently effective ResidentialComponentValues for each ResidentialComponentType, but exclude any that are only effective after the specified date. – Dr. Wily's Apprentice Jan 03 '12 at 22:40
  • He already has an index on `(ResidentialComponentTypeID, EffectiveDate)` – ypercubeᵀᴹ Jan 03 '12 at 22:41
  • Ah, good then. I didn't see the updated question with the DDL. – Dr. Wily's Apprentice Jan 03 '12 at 22:50
  • The second query seems to return the same results with my test data. I haven't tried the first query yet. I'm not sure what I'm going to end up doing to make LLBL work though. – Dan Jan 03 '12 at 23:23
  • @Dan: I don't see anywhere in your query the `ResidentialComponentTypes.Active` or the `ResidentialComponentTypes.LastUpdated` to be used. So, how is it supposed to check that *"the ResidentialComponentType was also effective as of that date"* ? – ypercubeᵀᴹ Jan 03 '12 at 23:28
  • I think I don't necessarily want a check on Active in my case. LastUpdated is for other purposes. Here's what is actually happening: I'm querying to pull back ResidentialComponentTypes if they have any ResidentialComponentValues effective on or before the date. I am prefetching ResidentialComponentValues along with it, which generates the query I'm having problems with. – Dan Jan 03 '12 at 23:42
  • Someone had mentioned adding aliases. My initial attempt didn't seem to help. Maybe I didn't have something quite right. I tried it again and fixed the performance issue. I put the alias on ResidentialComponentValues where I was joining to ResidentialComponentTypes. But then I was having problems getting LLBL to accept the alias, so I rethought whether I needed the join to ResidentialComponentValues at all, and I think I don't. – Dan Jan 04 '12 at 17:25
  • @Dan - I agree, the join to ResidentialComponentTypes seems incorrect, given how you've described the result that you're trying to get. It may also be causing your performance concerns. – Dr. Wily's Apprentice Jan 04 '12 at 17:37