11

There are times I can use force index option to use particular index on a query to make the query faster.

Then after some times the data in that table may change. And the force index which I used may not be the right index search for that query.

My questions are

  • Is there anyway to suggest the Query Optimiser to use a force index as a possible option during its plan generation. If it is a slow plan on using the forced index then it can use normal query planning.

  • Or Else Is there anyway to edit the Code of MySql/PSQL to suggest the Optimiser to use a force index as a possible option during its plan generation.

Additional Info: I wanted to add my plan to the optimiser plan list( that Optimiser already created many plan for a query). So that while choosing the best plan for a query I wanted the optimiser to consider my plan also. If this way is possible then the optimiser need not consider force index every time. It can keep the force index as a possible option to the plan generation

I tried to make the question clear as of my knowledge. If anyone couldn't understand comment your queries.

vinieth
  • 1,204
  • 3
  • 16
  • 34

6 Answers6

6

The only way is to remove FORCE INDEX from your query.

You can override the optimizer's plan, or else you let the optimizer choose the plan.

The optimizer can't tell if using the index will be slow. It tries to make its best guess at choosing a plan that will be fast.

The only way the optimizer could be more accurate is to run both queries, and compare the performance difference, and then run the fast query. That's obviously not practical, so the optimizer is meant to estimate the cost of the query, based on some statistical model. Most of the time this is accurate enough that the optimizer's choice is best.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
5

The general concept with which you are dealing is "internal database statistics" (not an official term). That is, the information the database engine uses to make the query plan. The statistics store details like the key distribution within the indexes, table row sizes, counts, percentage of null rows, distinctness of data, and so on.

For historical (and relevant) performance reasons, the internal database statistics are not updated on-the-fly, but are typically updated periodically at the the DBA's behest. In Postgres land, this information is succinctly available via the pg_statistic table. Unfortunately, MySQL does not make this information readily available, but you can see what you can glean from the INFORMATION_SCHEMA and SHOW INDEXES.

To your problem at hand -- the fleeting usefulness of query plan hints -- is that the DB's statistics are not representative of the table data. The general response then is: update the DB's statistics of the tables in question, and you won't need to provide optimizer hints in your queries:

ANALYZE TABLE <table_name>;

There are some caveats of which to be aware:

  • Without extra measures, this analysis by the DB engine is ephemeral: it will not persist across a database restart/reboot. The options are to reanalyze tables upon DB start, or to look in to statistics persistence.

  • This generates statistics by sampling the data -- looking at a subset of random table rows. By nature, a random sampling means that the statistics gathered might, on any particular run, not be a good representation of the actual data, leading to poor query plans. You can increase the number of sample pages via innodb_stats_persistent_sample_pages

  • ANALYZE is not OPTIMIZE, and your tables might also need optimization. OPTIMIZE is an expensive operation, requiring a lock on the table, but can be hugely beneficial under certain conditions.

  • ANALYZE is table specific, and cannot make up for poor table layout. For example, I was recently tasked with speeding up a slow running set of report queries. The culprit? The schema was storing time and date columns as strings, rather than as native data types. This had huge implications as the queries first had to cast the strings to dates (including with errors in the data) before comparison, resulting in multiple entire table scans. Very bad.

  • ANALYZE cannot make up for poor queries. For example, as with the previous bullet point, something like WHERE SOME_FUNCTION( col ) > some_value means that the query planner likely cannot utilize the indexes as each row must first execute the FUNCTION to get the condition result. Not always the case, but often found in naive SQL queries.

  • ANALYZE cannot make up for poor normalization. Not withstanding the potentially large algorithmic constants, you will get far bigger big-O algorithmic wins by having proper normalization.

  • ANALYZE does not create indexes. If a query accesses columns that have high-cardinality but no indexes, too bad. Knowing on which columns to put indexes per the known or expected queries is up to the DBA.

And as usual in cases like these, take all of StackOverflow advices with grains of salt: you are in charge of the data or programs and have the actual running product in front of you. You can test, you can measure, you can poke around. We can't.

hunteke
  • 3,648
  • 1
  • 7
  • 17
4

FORCE INDEX may be good today, but bad tomorrow. This is because the data distribution changes and/or the constants in the query change.

There are a few of patterns that often lead to picking the "wrong" index:

SELECT ...
    WHERE ...
    ORDER BY ...

and it is unclear whether to optimize for the WHERE or for the ORDER BY.

SELECT ...
    FROM a
    JOIN b ON ...
    WHERE a.x ...
      AND b.y ...

and it is not obvious whether to filter on a.x or b.y first. (It cannot filter on both at the same time.)

If you provide your particular query, we may be able to provide specific hints.

Sometimes, the query can be reformulated to trick the Optimizer into taking one path versus the other. Version 8.0 has a number of "hints", but that's not necessarily better than FORCE INDEX. STRAIGHT_JOIN is another flavor of hint.

Reformulating the query may involve turning part of it into a subquery. But again, you are risking "good today, bad tomorrow".

If you have "over-normalized" (such as normalizing a datetime, then using BETWEEN), the solution may involve denormalizing.

Are you using 'composite' indexes? Is the 'range' part of the WHERE last in the index? (And a lot of other questions -- let's see the specific query.)

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • There is no specific query I'm dealing with. I wanted to learn more about query optimiser. I wanted to add my plan to the optimiser plan list( that Optimiser already created many plan for a query). So that while choosing the best plan for a query I wanted the optimiser to consider my plan also. If this way is possible the optimiser need not consider force index every time. It can keep the force index as a possible option to the plan generation. Note:(Force Index is an example I'm telling) – vinieth Apr 18 '18 at 05:07
  • @vinieth, most likely optimiser is considering your "best" plan already, but, based on its cost model it decides that the cost of the "best" plan is worse than the cost of some other plan, so it doesn't use the "best" plan. – Vladimir Baranov Apr 18 '18 at 11:27
  • @VladimirBaranov But there some cases it doesn't find best plan. At that if a suggestion plan from me to optimiser will be good. Is there any way for that. By change the code or something else. (I need to an add extra plan to the optimiser plan list always for each different query i execute). – vinieth Apr 18 '18 at 11:39
  • As I point out, the Optimizer can't always find the 'best' plan. It is often due to limited insight into the data distribution in the table(s). – Rick James Apr 18 '18 at 12:50
  • @vinieth - for learning more: http://mysql.rjweb.org/doc.php/index_cookbook_mysql – Rick James Apr 18 '18 at 12:50
  • @RickJames Already read Bro. I usually refer your documents when i have doubt. Any extra deep solution by you for this question. – vinieth Apr 18 '18 at 12:59
  • @RickJames And also try to give solution to this question. [Link_to_the _Question](https://stackoverflow.com/questions/49809452/wild-card-before-and-after-a-word-mysql-psql) – vinieth Apr 18 '18 at 13:01
  • My doc covers the things that I can easily discuss in generalities. For more advanced things, I need to work from specific queries. Indexing is a huge subject. At some point, I have to say, "go read the many documents on the web" or "read the code". The thing I try to bring to the table is "how to apply the info". – Rick James Apr 18 '18 at 13:50
  • @RickJames Can you answer this question https://stackoverflow.com/questions/53775049/check-data-dictionary-size – vinieth Dec 14 '18 at 07:21
  • @RickJames Can you help me with this question https://stackoverflow.com/questions/55534892/store-count-in-cache?noredirect=1#comment97773315_55534892 – vinieth Apr 05 '19 at 12:11
  • @RickJames Can you checkout this question https://stackoverflow.com/questions/58065891/varchar-vs-text-performance-when-data-fits-on-row – vinieth Sep 29 '19 at 13:28
4

As far as I know,

Mysql is not good enough to support your requirement as lack of some kind of statistics and mechanism, for example, histogram, dynamic sampling and etc.

So if data changes, although collect statistics in time, lack of necessary things described previous, the optimizer may not behaviour as you expect.

Your two questions may be good practice, but is hard for final users.

Or you could advice for oracle or maria db team?

Martin54
  • 1,349
  • 2
  • 13
  • 34
yaoweijq
  • 263
  • 2
  • 12
4

Q: Is there anyway to suggest the Query Optimiser to use a force index as a possible option during its plan generation. If it is a slow plan on using the forced index then it can use normal query planning.

A: No. The optimizer does what it does with what its given. The optimizer doesn't ignore hints. Any hints included in the SQL text have an influence.

This is why hints are a "last resort". Our preference is to have the optimizer choose an appropriate plan. And the best way to make that happen is to carefully write SQL to avoid constructs that prevent appropriate indexes from being used (for example, wrapping a column in function), to have suitable indexes available, and to make sure statistics are accurate and up to date.

Q: Is there anyway to edit the Code of MySql/PSQL to suggest the Optimiser to use a force index as a possible option during its plan generation?

A: The optimizer will consider all of the indexes, and choose to use an particular index if the candidate access plan has the lowest cost estimate. This doesn't require a FORCE INDEX hint; MySQL will be considering the index.


Using hints in the SQL text is a last resort. Before we consider using hints (FORCE INDEX or otherwise), we should

  • use SQL constructs that allow (don't prevent) usage of appropriate index. for example, with a DATETIME column dt, MySQL can use range scan operation for this dt >= '2018-04-22' AND dt < '2018-04-23' but not for this DATE(dt) = '2018-04-22'. Sometimes query performance issues can be addressed with some simple tweaks to the SQL text. Sometimes a more extensive re-design and re-write.

  • have suitable indexes available (for example, consider composite indexes with appropriate leading columns, and remove redundant indexes on singleton column)

  • ensure statistics are up-to-date and are representative of the table (Note that InnoDB gathers statistics using a small sample of pages. If the pages that get sampled aren't representative of the table, we can get statistics that are skewed.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
0

My Question is, have you ever experienced slow execution because of force index?

If not,
go with optimizer's choice. It skips force index in bad situations
ex: select last_name from employees force index(idx_last_name) order by last_name;
in above query, it skips force index

If yes,

You can override the optimizer's plan based on your observations.

Community
  • 1
  • 1
Jameel Grand
  • 2,294
  • 16
  • 32
  • I wanted to add my plan to the optimiser plan list( that Optimiser already created many plan for a query). So that while choosing the best plan for a query I wanted the optimiser to consider my plan also. If this way is possible then the optimiser need not consider force index every time. It can keep the force index as a possible option to the plan generation. – vinieth Apr 24 '18 at 11:10