
I have read the theory and views behind SQL Server 2008's "OPTIMIZE FOR UNKNOWN" query plan option. I understand what it does well enough.

I did some limited experiments and found that, with a warm cache, it was only of benefit on > 100k rows. However, this was on a simple table and query with no joins, filtering, etc. With a cold cache, the picture would undoubtedly be much more in its favor.

I don't currently have a production system to bench the before/after. So I am curious if anyone has done before/after testing and made any useful discoveries as to exactly when to use this option and when not to.
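For reference, the hint is applied per statement. A minimal sketch of the shape of query I mean (the procedure, table, and parameter names here are invented for illustration):

    -- Minimal sketch; dbo.GetOrders, dbo.Orders, and @CustomerID are invented names.
    CREATE PROCEDURE dbo.GetOrders
        @CustomerID INT
    AS
    BEGIN
        SELECT OrderID, OrderDate
        FROM dbo.Orders
        WHERE CustomerID = @CustomerID
        OPTION (OPTIMIZE FOR UNKNOWN); -- plan is built from average density
                                       -- statistics, not the first sniffed
                                       -- @CustomerID value
    END;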

UPDATE:

I created a table with 3 cols, a PK on the UID, and an index on Col2 (int). All processing was against Col2. Columns below are row counts; values are times (DATEDIFF * 1000000):

Type          1,000     5,000     20,000    100,000
Normal        0.3086    6.327     26.427    144.83, 141.126
Recompile               117.59    584.837
For Unknown   0.8101    6.52      26.89     143.788, 143.248
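For reproducibility, here is a plausible reconstruction of that harness; the third column's name and type, the probe value, and the timing scaffold are my assumptions:

    -- Hypothetical reconstruction of the test table described above.
    CREATE TABLE dbo.TestTable
    (
        UID  INT IDENTITY(1, 1) PRIMARY KEY,  -- the PK mentioned above
        Col2 INT,                             -- all processing was against this
        Col3 VARCHAR(50)                      -- third column; name/type assumed
    );
    CREATE INDEX IX_TestTable_Col2 ON dbo.TestTable (Col2);

    -- Timing scaffold: DATEDIFF in microseconds, matching the
    -- "DATEDIFF * 1000000" scale used in the table above.
    DECLARE @t0 DATETIME2 = SYSDATETIME();

    SELECT COUNT(*)
    FROM dbo.TestTable
    WHERE Col2 = 42                 -- 42 is an arbitrary probe value
    OPTION (OPTIMIZE FOR UNKNOWN);  -- omit this hint for the "Normal" rows

    SELECT DATEDIFF(MICROSECOND, @t0, SYSDATETIME()) AS ElapsedMicroseconds;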
IamIC

1 Answer


You would use it when your data is sufficiently skewed that a plan generated with one parameter value would be completely unsuitable for another potential value of the parameter. i.e. to resolve a parameter sniffing issue.
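As a concrete illustration of that skew (the table, column, and distribution here are invented):

    -- Suppose dbo.Accounts.Region is heavily skewed: one region has a
    -- handful of rows, another has millions (all names are illustrative).
    CREATE PROCEDURE dbo.GetAccountsByRegion
        @Region VARCHAR(20)
    AS
        SELECT AccountID, Balance
        FROM dbo.Accounts
        WHERE Region = @Region
        -- Without the hint, whichever @Region is sniffed on the first call
        -- decides the cached plan: a seek + lookup plan compiled for the
        -- tiny region is disastrous when reused for the huge one. With the
        -- hint, the optimizer uses the column's average density instead.
        OPTION (OPTIMIZE FOR UNKNOWN);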

The remainder of your question doesn't seem particularly relevant to the purpose of the hint, or answerable, IMO.

Martin Smith
  • I could argue that a plan made by "@UID = 1", which would cause an index scan, not seek, would always be bad for "@UID = 100000". MSDN's blogs are highly vague on any best practices, giving only what you're saying here. The remainder of the question is highlighting a scenario that exacerbates the situation, so IMO it is very relevant. – IamIC Dec 07 '10 at 12:24
  • @IanC - It is completely unanswerable as written. The number of rows means nothing on its own without the context of an execution plan. – Martin Smith Dec 07 '10 at 12:28
  • I understand that. All I am asking for are some case studies. This is much like the "does hyperthreading hurt or help SQL Server" question. There is no "yes/no" answer. The only way to find any type of answer is to study some case studies and deduce patterns. This was done for this scenario and a conclusion was reached. It could be done for what I am asking too. – IamIC Dec 07 '10 at 12:33
  • @IanC - I personally have only used it to date when I've identified a specific parameter sniffing issue. I suppose in the cases where I don't use it I'm implicitly assuming that **any** parameter value will do, so `OPTIMIZE FOR UNKNOWN` wouldn't do any harm there either. – Martin Smith Dec 07 '10 at 12:56
  • @Martin I'm making the same assumptions. Knowing the difference between a query plan choosing a scan vs. a seek makes me nervous. However, I have found that using this hint in the wrong place is almost 3 times slower for the test case I give (see edit). So it doesn't seem a good idea to "just use it". – IamIC Dec 07 '10 at 13:51
  • @IanC - But if you don't use it you are leaving it up to chance what parameter value will be used for compilation anyway. (unless `optimize for` optimizes for a value that exists in the data but is never called in practice) so maybe there's an argument for saying that every parameterised statement should either have `OPTIMIZE FOR UNKNOWN`, `OPTIMIZE FOR SOME SPECIFIC VALUE`, or `WITH RECOMPILE` (I don't actually do this!) – Martin Smith Dec 07 '10 at 13:57
  • @Martin I know!! That's my point. However, per the mini benchmarks I did, the overhead of the hint outweighs the advantages unless the rowcount is high (hence my original question). If we were selecting @CountryID, it would be utterly pointless to use the hint. Even @CustomerID, where we only have 10k customers, would be futile, despite an inefficient execution plan. However, as the complexity of the query rises, I suspect the rowcount advantage threshold drops. Note I am using this "rowcount" idea loosely. – IamIC Dec 07 '10 at 14:00
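For completeness, the three alternatives raised in the comments above, side by side; this is a sketch only, with all object names invented (`WITH RECOMPILE` also exists as a procedure-level variant of option 3):

    -- Sketch of the three options discussed above; dbo.Orders and
    -- @CustomerID are invented names.
    CREATE PROCEDURE dbo.GetOrdersDemo
        @CustomerID INT
    AS
    BEGIN
        -- 1. Compile against average density, ignoring the sniffed value:
        SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID
        OPTION (OPTIMIZE FOR UNKNOWN);

        -- 2. Compile as if a chosen representative value had been passed:
        SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID
        OPTION (OPTIMIZE FOR (@CustomerID = 42));

        -- 3. Recompile this statement on every execution (statement-level
        --    form of WITH RECOMPILE), paying CPU for a per-value plan:
        SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID
        OPTION (RECOMPILE);
    END;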