4

I use SQL Server 2016 Express and a Java application with JDBC driver version 4.2.

My database has a collation of Latin1_General_CI_AS (case insensitive). My table has a column of type VARCHAR(128) NOT NULL. There is an unique index on that column.

My test scenario is as follows:

After inserting 150000 strings of 48 characters length I do 200 selects of randomly chosen, existing strings. I measure the total execution time of all queries.

Then I drop the index, alter the table to change the columns collation to Latin1_General_CS_AS (case sensitive) and create the unique index again.

Then 200 selects take in total more time.

In both cases (CI and CS) the execution plans are simple and identical (search by using the index).

The query execution time not only depends on case sensitivity. With collation CS it grows faster if the strings have identical prefixes. Here are my results (execution time in seconds):

+----+---------+------------------+-------------------+-------------------+
|    + RND(48) + CONST(3)+RND(45) + CONST(10)+RND(38) + CONST(20)+RND(28) +
+----+---------+------------------+-------------------+-------------------+
| CI +       6 +                6 +                 7 +                 9 +
| CS +      10 +               20 +                45 +                78 +
+----+---------+------------------+-------------------+-------------------+

The longer the identical prefix of the random strings is the more time the case sensitive queries take.

  1. Why is the search on case insensitive column faster than on case sensitive column?
  2. What is the reason for the identical prefix behavior?
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
zuserus
  • 206
  • 2
  • 5
  • What does `search by using the index` mean? An index scan is *slow* while an index seek is fast. Post the queries and execution plans – Panagiotis Kanavos May 15 '18 at 15:51
  • BTW did you specify the new collation when you recreated the index? What does `it grows faster if the strings have identical prefixes` mean? You can't use an index to search in the middle of a field eg with `LIKE `%whatever``. Only queries that match a prefix can use an index, ie `LIKE 'whatever%'`. Does *any* of the queries use the index? – Panagiotis Kanavos May 15 '18 at 15:52

1 Answers1

0

The reason is because your SQL installation (I am guessing) was done with CI collation. This means your tempdb and master databases are using CI and currently so is your own database. Therefore, even though you changed your character column to be CS, when it is used in tempdb for sorting/merging operations, that is executed in a CI context. To get an accurate comparison, you need to change your installation collation to be CS or make these comparisons side by side on different SQL instances - one using CS and one using CI.

Lee James
  • 379
  • 1
  • 9
  • tempdb isn't used unless there are no indexes or there is so much data that the server has to store partial results. That could happen if the index is *not* used. The OP mentioned prefixes which makes me wonder whether the query uses conditions like `LIKE '%something'` – Panagiotis Kanavos May 15 '18 at 15:56