4

Ok before I start I want to say, I am totally new to this Sql indexing thing.

I have a table that is not join to anything. It has the following columns:

Id (int)
String1 (nvarchar(10)
String2 (nvarchar(50)
DateTime1 (date)
DateTime2 (date)
DateTime3 (date)

I have about 100,000,000 of rows on that table. And doing a search on it is very slow, so I figure I have to add some indexes.

I will ONLY run the following queries:

Query 1

select * from Table 
where String1 = "Blah" and 
String 2 = "Blah" and 
DateTime1 <= {someTime1} and 
DateTime2 >= {someTime1}

Query 2

select * from Table 
where String1 = "Blah" and 
String 2 = "Blah" and 
DateTime2 >= {someTime1}

Query 3

select * from Table 
where String1 = "Blah" and 
String 2 = "Blah" and 
DateTime3 >= {someTime1}

Notice that they are pretty much the same query except they have a slightly different date comparison. Also, sorting is not an issue.

So I tried adding a non-clustered index on the column String1, String2, DateTime1, DateTime2. Running Query 1 here's what I see:

  1. It is a lot faster now, but still takes about 20 sec to load.
  2. I notice that for the same exact same query (with the same search parameters) if I call it again it will return data in less than a second.
  3. I notice that if I run query 1 with some other parameters, it will again take 20 secs to load.
  4. I notice that my RAM goes up and stays up after the query.

So here are my questions:

  1. Am I doing this right? Why does it take 20 secs to load? Shouldn't it be very fast after I added index?
  2. What is Sql server doing with my RAM? Do I need more RAM because I have a big table?
  3. Do I need to add new indexes for query 2 and query 3? Or is the index that I added already good enough for the other 2 queries?

Thanks,

Chi

Chi Chan
  • 11,890
  • 9
  • 34
  • 52
  • You know, I've read a lot of SQL questions. This is a pretty well asked question. Kudos. – Matt Nov 02 '10 at 22:22

5 Answers5

2
  1. Parsing and compiling the query, and maybe reading from disk. This is why it runs quick the 2nd time. And takes time to compile again when parameters change.

  2. Data cache. aka Buffer pool. More RAM won't go wrong with SQL Server generally, ever.

  3. Query 1 and 2 are the same, query 3 is different.

I would suggest 2 indexes to start with

  • String1, String 2, DateTime2, DateTime1 INCLUDE DateTime3
  • String1, String 2, DateTime3 INCLUDE DateTime2, DateTime1

Other thoughts regards datatypes... the smalle the better if course

Edit:

Disk reads will happen one into memory (simply) so more RAM will help However, I suspect the 20 secodns is compile + statistics etc, not reading from disk

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
  • Thanks gbn. I am selecting around 2k rows and is disk read what's taking the time? Are there any suggestions as to how to make it faster? – Chi Chan Nov 02 '10 at 17:59
  • one more question, so there any methods to reduce the compile and statistic time? Is moving those into stored proc going to help? – Chi Chan Nov 02 '10 at 18:13
  • @Chi Chan: It can do. You are more likely to get plan re-use than with hard coded parameters – gbn Nov 02 '10 at 18:15
  • This is a good answer. I would like to add: You will not get plan re-use in plain ol' SQL Server if the query text is different, even by a comment. `SELECT ... WHERE String1 = 'Blah'` won't use the cached plan for `SELECT ... WHERE String1 = 'Foo'`. You'd need `SELECT ... WHERE String1 = @String1`. – Matt Nov 02 '10 at 22:31
  • Also, after so many inserts/deletes/etc. SQL Server will decide your statistics aren't accurate and will recompile them on your next query. So, if you are always doing a `SELECT` after a huge `INSERT`, you may always have to pay the compile penalty, unfo. – Matt Nov 02 '10 at 22:32
  • @Matt: good points, thanks. Plan re-use also depends on whitespace and case too. – gbn Nov 03 '10 at 05:12
0

I'd recommend familiarizing yourself with SQL Server's Database Tuning Advisor (DTA) and Profiler.

Here is a good article:

http://www.zimbio.com/SQL/articles/655/How+Tune+Database+Using+Database+Tuning+Advisor

DTA won't always give you perfect recommendations but it's usually a good start. If you use profiler you can monitor your database for a day and then pass that to the DTA.

There are also other things to take into account such as:

  • How often will specific queries be run
  • How often will the tables have records inserted or updated

Keep in mind that having indexes will slow your insert's and updates.

Another important thing to do is make sure you are starting from the same baseline with every query you run. I suspect you are seeing the results of 1 and 2 in your question because you have cached data. Here is a good link that goes over how test sql scripts.

Abe Miessler
  • 82,532
  • 99
  • 305
  • 486
0

The problem with query 1 is that it has two independent range conditions. There was a very similar question just recently. I explained that particular issue there: How Database stores data internally in B-Tree/B+Tree

regarding query 2:

if you would use the following index:

String1, String2, Date2, Date 1

it can serve query2 quite good, and doesn't change query1 (except the condition on date1 is much more selective than date2).

query3 might need an additional index:

String1, String2, Date3

However, I don't like two indexes with the same prefix. I would probably turn String1 and String2 around--just in case some query has String2 only.


Want better understanding of all this indexing stuff? have a look at my free eBook "Use The Index, Luke"

Community
  • 1
  • 1
Markus Winand
  • 8,371
  • 1
  • 35
  • 44
0
  1. Am I doing this right? Why does it take 20 secs to load? Shouldn't it be very fast after I added index?

-> I think you are doing just fine but I would also see the data selectivity in your query. How many records will by returned in your typical query out of 100,000,000? If your resultset is small(3~5% of entire table), indexing is good resolution.

  1. What is Sql server doing with my RAM? Do I need more RAM because I have a big table? -> DBMS is moving data(block by block) from physical storage to your RAM to execute your query. Also your query needs to parsing and it would consume some memory.
exiter2000
  • 548
  • 5
  • 14
0

More RAM will help depending on your operating system SQL Server is installed on, the version of SQL Server and how much RAM you already have. They all don't have the same limits on RAM.

For future reference use the Analyze Query in Database Tuning Advisor in SQL Server Management Studio (SSMS). You may not need it now, but could help when the tables get more complicated.

You do have a primary key on this table? Just curious, you never mentioned it.

JeffO
  • 7,957
  • 3
  • 44
  • 53
  • The Id is the p.Key, it is however, a random int assigned using hi-lo with nhibernate. It's safe to say that I will never use the key to select/update/delete anything. – Chi Chan Nov 02 '10 at 18:09