If you're missing information, I will attach them if requested.
Workspace
I have a database running on a MS SQL 2012 standard edition of this kind:
tables:
users(id, softId (not unique), birthdate)
- rows: 10.5 million
- indexes: all three columns, birthdate(clustered)
docs(docId, userId, creationDate, deleteDate, lastname, forename, classificationId)
rows: 23 million
indexes: lastname, forename, docId, creationDate, userID(clustered)
notice: in this specific case the names are related to the docs, not to the userId
classifications(id, description)
- rows: 200
three tables "data"
- rows: 10, 13 and 0.3 million
- indexes: docIds
relations:
users to docs: 1 to n
classifications to docs: 1 to n
docs to data-tables: 1 to n
To select the complete records I am actually on following statements:
Server-Execution-Time 16 seconds
SELECT * FROM (
select * from docs
where userID in (
select distinct userID from users where softId like '...'
)
) as doc
LEFT JOIN users on users.userID = doc.userId
LEFT JOIN classifications on classifications.id = doc.classificationId
LEFT JOIN data1 on data1.docId = doc.docId
LEFT JOIN data2 on data2.docId = doc.docId
LEFT JOIN data3 on data3.docId = doc.docId;
Updated - now 15 seconds
SELECT
docID, calssificationId, classificationDescription,
userId, softId, forename, lastname, birthdate,
data1.id, data1.date, data2.id, data2.date, data3.id, data3.date,
FROM docs
JOIN users on users.userID = doc.userId AND softId like '...'
LEFT JOIN classifications on classifications.id = doc.classificationId
LEFT JOIN data1 on data1.docId = doc.docId
LEFT JOIN data2 on data2.docId = doc.docId
LEFT JOIN data3 on data3.docId = doc.docId;
execution plans
Server-Execution-Time 17 seconds
DECLARE @userIDs table( id bigint );
DECLARE @docIDs table( id bigint );
insert into @userIDs select userID from users where softId like '...';
insert into @docIDs select docId from docs where userId in ( select id from @userIDs);
SELECT * FROM users where userID in ( select id from @userIDs);
SELECT * FROM docs where docID in (select id from @docIDs);
SELECT * FROM data1 where data1.docId in (select id from @docIDs);
SELECT * FROM data2 where data2.docId in (select id from @docIDs);
SELECT * FROM data3 where data3.docId in (select id from @docIDs);
GO
Updated - now 14 seconds
DECLARE @userIDs table( id bigint, softId varchar(12), birthdate varchar(8) );
DECLARE @docIDs table( id bigint, classification bigint, capture_date datetime, userId bigint, lastname varchar(50), forename varchar(50) );
INSERT INTO @userIDs select userID, softId, birthdate from users where softId like '...';
INSERT INTO @docIDs select docID, classification, capture_date, userID, lastname, forename from docs where userID in ( select id from @userIDs);
SELECT * FROM @userIDs;
SELECT * FROM @docIDs;
SELECT [only needed fields] FROM data1 where docID in (select id from @docIDs);
SELECT [only needed fields] FROM data2 where docID in (select id from @docIDs);
SELECT [only needed fields] FROM data3 where docID in (select id from @docIDs);
execution plans
General Updates @AntonínLejsek suggested to define the docId of documents as a clustered index and the pkId as non-clustered. This changed the execution-time as follow:
- Join-Statement: -1 second
- Multi-Select-Statement: -5 seconds
I checked the indexes again and changed the included columns, now they have the execution-time:
- Join-Statement: 4 seconds
- Multi-Select-Statement: 6 seconds
The "simple" question
Do somebody have suggestions to reduce the executiontime?