22

My situation:

  • the query searches around 90,000 vehicles
  • the query takes long each time
  • I already have indexes on all the fields being JOINed.

How can I optimise it?

Here is the query:

SELECT vehicles.make_id,
       vehicles.fuel_id,
       vehicles.body_id,
       vehicles.transmission_id,
       vehicles.colour_id,
       vehicles.mileage,
       vehicles.vehicle_year,
       vehicles.engine_size,
       vehicles.trade_or_private,
       vehicles.doors,
       vehicles.model_id,
       Round(3959 * Acos(Cos(Radians(51.465436)) *
                         Cos(Radians(vehicles.gps_lat)) *
                                           Cos(
                                           Radians(vehicles.gps_lon) - Radians(
                                           -0.296482)) +
                               Sin(
                                      Radians(51.465436)) * Sin(
                               Radians(vehicles.gps_lat)))) AS distance
FROM   vehicles
       INNER JOIN vehicles_makes
         ON vehicles.make_id = vehicles_makes.id
       LEFT JOIN vehicles_models
         ON vehicles.model_id = vehicles_models.id
       LEFT JOIN vehicles_fuel
         ON vehicles.fuel_id = vehicles_fuel.id
       LEFT JOIN vehicles_transmissions
         ON vehicles.transmission_id = vehicles_transmissions.id
       LEFT JOIN vehicles_axles
         ON vehicles.axle_id = vehicles_axles.id
       LEFT JOIN vehicles_sub_years
         ON vehicles.sub_year_id = vehicles_sub_years.id
       INNER JOIN members
         ON vehicles.member_id = members.id
       LEFT JOIN vehicles_categories
         ON vehicles.category_id = vehicles_categories.id
WHERE  vehicles.status = 1
       AND vehicles.date_from < 1330349235
       AND vehicles.date_to > 1330349235
       AND vehicles.type_id = 1
       AND ( vehicles.price >= 0
             AND vehicles.price <= 1000000 )  

Here is the vehicle table schema:

CREATE TABLE IF NOT EXISTS `vehicles` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `number_plate` varchar(100) NOT NULL,
  `type_id` int(11) NOT NULL,
  `make_id` int(11) NOT NULL,
  `model_id` int(11) NOT NULL,
  `model_sub_type` varchar(250) NOT NULL,
  `engine_size` decimal(12,1) NOT NULL,
  `vehicle_year` int(11) NOT NULL,
  `sub_year_id` int(11) NOT NULL,
  `mileage` int(11) NOT NULL,
  `fuel_id` int(11) NOT NULL,
  `transmission_id` int(11) NOT NULL,
  `price` decimal(12,2) NOT NULL,
  `trade_or_private` tinyint(4) NOT NULL,
  `postcode` varchar(25) NOT NULL,
  `gps_lat` varchar(50) NOT NULL,
  `gps_lon` varchar(50) NOT NULL,
  `img1` varchar(100) NOT NULL,
  `img2` varchar(100) NOT NULL,
  `img3` varchar(100) NOT NULL,
  `img4` varchar(100) NOT NULL,
  `img5` varchar(100) NOT NULL,
  `img6` varchar(100) NOT NULL,
  `img7` varchar(100) NOT NULL,
  `img8` varchar(100) NOT NULL,
  `img9` varchar(100) NOT NULL,
  `img10` varchar(100) NOT NULL,
  `is_featured` tinyint(4) NOT NULL,
  `body_id` int(11) NOT NULL,
  `colour_id` int(11) NOT NULL,
  `doors` tinyint(4) NOT NULL,
  `axle_id` int(11) NOT NULL,
  `category_id` int(11) NOT NULL,
  `contents` text NOT NULL,
  `date_created` int(11) NOT NULL,
  `date_edited` int(11) NOT NULL,
  `date_from` int(11) NOT NULL,
  `date_to` int(11) NOT NULL,
  `member_id` int(11) NOT NULL,
  `inactive_id` int(11) NOT NULL,
  `status` tinyint(4) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `type_id` (`type_id`),
  KEY `make_id` (`make_id`),
  KEY `model_id` (`model_id`),
  KEY `fuel_id` (`fuel_id`),
  KEY `transmission_id` (`transmission_id`),
  KEY `body_id` (`body_id`),
  KEY `colour_id` (`colour_id`),
  KEY `axle_id` (`axle_id`),
  KEY `category_id` (`category_id`),
  KEY `vehicle_year` (`vehicle_year`),
  KEY `mileage` (`mileage`),
  KEY `status` (`status`),
  KEY `date_from` (`date_from`),
  KEY `date_to` (`date_to`),
  KEY `trade_or_private` (`trade_or_private`),
  KEY `doors` (`doors`),
  KEY `price` (`price`),
  KEY `engine_size` (`engine_size`),
  KEY `sub_year_id` (`sub_year_id`),
  KEY `member_id` (`member_id`),
  KEY `date_created` (`date_created`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=136237 ;

The EXPLAIN:

1   SIMPLE  vehicles    ref     type_id,make_id,status,date_from,date_to,price,mem...   type_id     4   const   85695   Using where
1   SIMPLE  members     index   PRIMARY     PRIMARY     4   NULL    3   Using where; Using index; Using join buffer
1   SIMPLE  vehicles_makes  eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.make_id    1   Using index
1   SIMPLE  vehicles_models     eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.model_id   1   Using index
1   SIMPLE  vehicles_fuel   eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.fuel_id    1   Using index
1   SIMPLE  vehicles_transmissions  eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.transmission_id    1   Using index
1   SIMPLE  vehicles_axles  eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.axle_id    1   Using index
1   SIMPLE  vehicles_sub_years  eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.sub_year_id    1   Using index
1   SIMPLE  vehicles_categories     eq_ref  PRIMARY     PRIMARY     4   tvs.vehicles.category_id    1   Using index
ChimeraTheory
  • 493
  • 1
  • 4
  • 11

5 Answers5

16

Improving the WHERE clause

Your EXPLAIN shows that MySQL is only utilizing one index (type_id) for selecting the rows that match the WHERE clause, even though you have multiple criteria in the clause.

To be able to utilize an index for all of the criteria in the WHERE clause, and to reduce the size of the result set as quickly as possible, add a multi-column index on the following columns on the vehicles table:

(status, date_from, date_to, type_id, price)

The columns should be in order of highest cardinality to least.

For example, vehicles.date_from is likely to have more distinct values than status, so put the date_from column before status, like this:

(date_from, date_to, price, type_id, status)

This should reduce the rows returned in the first part of the query execution, and should be demonstrated with a lower row count on the first line of the EXPLAIN result.

You will also notice that MySQL will use the multi-column index for the WHERE in the EXPLAIN result. If, by chance, it doesn't, you should hint or force the multi-column index.

Removing the unnecessary JOINs

It doesn't appear that you are using any fields in any of the joined tables, so remove the joins. This will remove all of the additional work of the query, and get you down to one, simple execution plan (one line in the EXPLAIN result).

Each JOINed table causes an additional lookup per row of the result set. So, if the WHERE clause selects 5,000 rows from vehicles, since you have 8 joins to vehicles, you will have 5,000 * 8 = 40,000 lookups. That's a lot to ask from your database server.

Marcus Adams
  • 53,009
  • 9
  • 91
  • 143
4

Instead of expensive calculation of precise distance for all of the rows use a bounding box and calculate the exact distance only for rows inside the box.

The simplest possible example is to calculate min/max longitude and latitude that interests you and add it to WHERE clause. This way the distance will be calculated only for a subset of rows.

WHERE
    vehicles.gps_lat > min_lat ANDd vehicles.gps_lat < max_lat AND
    vehicles.gps_lon > min_lon AND vehicles.gps_lon < max_lon

For more complex solutions see:

Community
  • 1
  • 1
Mariusz Jamro
  • 30,615
  • 24
  • 120
  • 162
3

Is you SQL faster without this?

Round(3959 * Acos(Cos(Radians(51.465436)) *
  Cos(Radians(vehicles.gps_lat)) *
  Cos(Radians(vehicles.gps_lon) - 
  Radians(-0.296482)) + 
  Sin(Radians(51.465436)) * 
  Sin(Radians(vehicles.gps_lat)))) AS distance

performing math equation is very expensive

Maybe you should consider a materialized view that pre-calculate you distance, and you can select from that view. Depending on how dynamic you data is, you may not have to refresh you data too often.

Churk
  • 4,556
  • 5
  • 22
  • 37
1

To clarify this as an answer: if you do not already have these indexes, you should consider adding them

do you also have indexes on these:

vehicles.status
vehicles.date_from
vehicles.date_to
vehicles.type_id
vehicles.price
Randy
  • 16,480
  • 1
  • 37
  • 55
1

To be a little more specific than @Randy of indexes, I believe his intention was to have a COMPOUND index to take advantage of your querying critieria... One index that is built on a MINIMUM of ...

( status, type_id, date_from )

but could be extended to include the date_to and price too, but don't know how much the index at that granular level might actually help

( status, type_id, date_from, date_to, price )

EDIT per Comments

You shouldn't need all those individual indexes... Yes, the Primary Key by itself. However, for the others, you should have compound indexes based on what your common querying criteria might be and remove the others... the engine might get confused on which might be best suited for the query. If you know you are always looking for a certain status, type and date (assuming vehicle searches), make that as one index. If the query is looking for such information, but also prices within that criteria it will already be very close on the few indexed records that qualify and fly through the price as just an extra criteria.

If you offer querying like Only Automatic vs Manual transmission regardless of year/make, then yes, that could be an index of its own. However, if you would TYPICALLY have some other "common" criteria, tack that on as a secondary that MAY be utilized in the query. Ex: if you look for Manual Transmissions that are 2-door vs 4-door, have your index on (transmission_id, category_id).

Again, you want whatever will help narrow down the field of criteria based on some "minimum" condition. If you tack on an extra column to the index that might "commonly" be applied, that should only help the performance.

DRapp
  • 47,638
  • 12
  • 72
  • 142
  • I'm not familiar with compound indexes - please see my updated post. Is my current indexing inefficient? – ChimeraTheory Feb 27 '12 at 14:49
  • Just add another index, but instead of a SINGLE column, just do as above, multiple columns separated by columns... This way, one index can have multiple components to more closely match the querying criteria. – DRapp Feb 27 '12 at 14:51
  • I have 21 indexes, which do I need to group and do I then need to remove them first? – ChimeraTheory Feb 27 '12 at 14:55
  • +1 that is what i intended... it seems the WHERE clause is probably the culprit, and a good index on those filtering columns will help a lot. – Randy Feb 27 '12 at 16:46
  • @ChimeraTheory, revised answer for index clarifications – DRapp Feb 27 '12 at 17:01
  • For each result (in my PHP loop) I also do a postcode lookup in a table with around 1.6 million UK postcode records. Would indexing this table help? It looks like this, what should be indexed: CREATE TABLE IF NOT EXISTS `post_codes` ( `id` int(11) NOT NULL AUTO_INCREMENT, `postcode` varchar(100) NOT NULL, `latitude` double(9,6) NOT NULL, `longitude` double(9,6) NOT NULL, PRIMARY KEY (`id`), KEY `postcode` (`postcode`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1696036 ; - should I fulltext index the postcode field as I do a lookup on this? – ChimeraTheory Feb 27 '12 at 20:06