I have a query which is returning the below explain:

id  select_type  table  type         possible_keys                                                                                        key                                                      key_len  ref  rows  Extra                                                                                  
1   SIMPLE       this_  index_merge  CategoryId,Deleted,PublishedOn,_ComputedDeletedValue,_Temporary_Flag,Published,_TestItemSessionGuid  Deleted,_ComputedDeletedValue,_Temporary_Flag,Published  1,1,1,1       6203  Using intersect(Deleted,_ComputedDeletedValue,_Temporary_Flag,Published); Using where  

Does this show that the query is using indexes everywhere, or can it be improved by adding other indexes? According to http://dev.mysql.com/doc/refman/5.1/en/explain-output.html, the key column shows the indexes actually used. I have an index on every column.

The query in question is the below:

SELECT SQL_NO_CACHE count(*)
FROM article this_
WHERE
(this_._Temporary_Flag = 0 OR this_._Temporary_Flag = NULL) AND
this_.Published = 1 AND
(this_.PublishedOn IS NULL OR this_.PublishedOn <= '2012-10-30 18:46:18 ') AND
(this_.Deleted = 0 OR this_.Deleted = NULL) AND
(this_._ComputedDeletedValue = 0 OR this_._ComputedDeletedValue = NULL) AND
((this_._TestItemSessionGuid IS NULL OR this_._TestItemSessionGuid = ''))
AND NOT (this_.CategoryId IS NULL)

Table has approximately 140,000 records. This query is taking 3 seconds to execute, and returning 135,725 as result.

Karl Cassar
  • in where clause: "Temporary_Flag = NULL)" <- fail. But you do it right on other occasions. – fancyPants Oct 31 '12 at 10:24
  • @tombom Thanks for pointing that out. I've since removed the 'Is NULL' check as it was useless, as the field is a boolean and does not even accept `NULL` values! – Karl Cassar Oct 31 '12 at 12:12
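
The point in the comment above can be checked directly. This is a minimal sketch of how MySQL evaluates comparisons with NULL; the commented-out condition at the end is only an illustration of the corrected form, not the asker's final query.

```sql
-- Comparing with "= NULL" never evaluates to true; the result is NULL.
SELECT NULL = NULL;   -- returns NULL
SELECT 0 = NULL;      -- returns NULL
SELECT NULL IS NULL;  -- returns 1

-- So a nullable flag has to be tested with IS NULL, for example:
-- (this_._Temporary_Flag = 0 OR this_._Temporary_Flag IS NULL)
```

Because `this_._Temporary_Flag = NULL` is never true, each `OR ... = NULL` branch in the query matches no rows and the condition is effectively just `= 0`.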

1 Answer

The EXPLAIN output shows that MySQL is using an index merge (an intersection) of four separate single-column indexes. The key_len of 1,1,1,1 means that one byte of each index is used, so all four columns take part in traversing the search trees.

However, having a separate index on every column is usually not the most efficient approach. In your case in particular, merging four indexes can take a lot of time. The actual lookup might be fast, but building the merged result may take 1-2 seconds.

I would suggest building a composite index on those columns. The order of the columns matters. Put the columns with equality conditions first, ordered by cardinality (higher cardinality first). The last column should be the one used in the range condition (in your case, PublishedOn).

For example:

create index my_query_IDX on article (Deleted, _Temporary_Flag, _ComputedDeletedValue, PublishedOn)
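
To pick the column order and to confirm the new index is used, something like the following may help. It is a sketch only: the column aliases are made up for illustration, and the EXPLAIN query is a simplified version of the one in the question, not the exact original.

```sql
-- Count distinct values per candidate column; a higher count means
-- higher cardinality.
SELECT COUNT(DISTINCT Deleted)               AS deleted_values,
       COUNT(DISTINCT _Temporary_Flag)       AS temporary_flag_values,
       COUNT(DISTINCT _ComputedDeletedValue) AS computed_deleted_values,
       COUNT(DISTINCT Published)             AS published_values,
       COUNT(DISTINCT PublishedOn)           AS published_on_values
FROM article;

-- The Cardinality column in this output is MySQL's own estimate per index.
SHOW INDEX FROM article;

-- After creating the composite index, re-run EXPLAIN and check whether
-- the key column now shows my_query_IDX instead of an index merge.
EXPLAIN SELECT COUNT(*)
FROM article this_
WHERE this_.Deleted = 0
  AND this_._Temporary_Flag = 0
  AND this_._ComputedDeletedValue = 0
  AND (this_.PublishedOn IS NULL OR this_.PublishedOn <= '2012-10-30 18:46:18');
```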

Another thing I would suggest is changing the _Temporary_Flag, Deleted, _ComputedDeletedValue, Published etc. columns to NOT NULL DEFAULT '0'. Indexes on nullable columns and NULL values are less effective than indexes on NOT NULL columns, and judging by the key_len these columns are BOOLEAN or TINYINT (which are the same thing in MySQL, by the way).
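
A possible way to make that change is sketched below. It assumes the flag columns are TINYINT(1) and that an existing NULL should be treated as 0; both are assumptions, so check the real column definitions with SHOW CREATE TABLE article before running anything like this.

```sql
-- Assumption: NULL in these flags means "not set", so it becomes 0.
UPDATE article
SET Deleted               = COALESCE(Deleted, 0),
    _Temporary_Flag       = COALESCE(_Temporary_Flag, 0),
    _ComputedDeletedValue = COALESCE(_ComputedDeletedValue, 0),
    Published             = COALESCE(Published, 0)
WHERE Deleted IS NULL
   OR _Temporary_Flag IS NULL
   OR _ComputedDeletedValue IS NULL
   OR Published IS NULL;

-- Assumption: TINYINT(1) is the existing type of each column.
ALTER TABLE article
  MODIFY Deleted               TINYINT(1) NOT NULL DEFAULT 0,
  MODIFY _Temporary_Flag       TINYINT(1) NOT NULL DEFAULT 0,
  MODIFY _ComputedDeletedValue TINYINT(1) NOT NULL DEFAULT 0,
  MODIFY Published             TINYINT(1) NOT NULL DEFAULT 0;
```

The UPDATE runs first so that the ALTER TABLE does not hit existing NULL values when the NOT NULL constraint is applied.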

Károly Nagy
  • This may be stupid, but what exactly do you mean by 'order of cardinality'? Would you suggest altering the original query, or adding just indexes? The chunk before `AND NOT (this_.CategoryId IS NULL) ` is like a 'standard' part of querying as those fields are common in all tables, and I cannot easily modify that part of the query, preferably. – Karl Cassar Oct 31 '12 at 12:08
  • @KarlCassar I wouldn't recommend having an index on a boolean column. Please have a look at my answer to another question. http://stackoverflow.com/questions/12296258/mysql-query-optimization-of-like-term-order-by-int/12296810#12296810 Apart from that, I'd agree with Károly Nagy. +1 – fancyPants Oct 31 '12 at 12:22
  • By 'order of cardinality' I meant that the more distinct values a column can hold, the higher its cardinality. For example, a boolean has a cardinality of 2 (true or false), which is not very selective. In most cases MySQL will do a full scan anyway, even if you have an index on it. In your case it might work. Be aware that good indexing depends heavily on the data itself, not just the schema. I love the example of movie databases: an index on the movie title is very good if you have a `where title like "Scarf%"` but won't be used for `where title like "The%"`. – Károly Nagy Oct 31 '12 at 14:15
  • I created the index specified by Károly Nagy, and it greatly improved the search, reducing it to 0.08 seconds for most queries. However, I still cannot understand why that selection, and for example why 'Published' wasn't added but 'Deleted' was. – Karl Cassar Nov 01 '12 at 14:32
  • The answer is the data distribution in the database. If 80% of your articles are published, MySQL gains little from using an index on that column. In this case a full scan is cheaper than traversing the B+tree. Optimizing indexes is about the data as well, not just the schema. – Károly Nagy Nov 08 '12 at 10:17
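
The data distribution mentioned in the last comment can be inspected with a simple grouping query. This is only a sketch; the aliases row_count and pct are hypothetical names chosen for illustration.

```sql
-- Show how the rows are spread across the values of Published.
-- If one value covers most of the table, an index on this column
-- alone is unlikely to help much.
SELECT Published,
       COUNT(*) AS row_count,
       ROUND(100 * COUNT(*) / (SELECT COUNT(*) FROM article), 1) AS pct
FROM article
GROUP BY Published;
```

The same query can be repeated for Deleted, _Temporary_Flag and _ComputedDeletedValue to compare how selective each condition is.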