0

I've read multiple questions in here but none could help me so far. For the same query and table structure on my previous [unanswered] question Optimizing a SELECT … UNION … query with ORDER and LIMIT on a table with 5M+ rows besides having all the indexes defined, the query is still logged as "not using index".

SELECT  `id`, `title`, `title_fa`
    FROM  
      ( SELECT  `p`.`id` AS `id`, `p`.`title` AS `title`, `p`.`title_fa` AS `title_fa`,
                `p`.`unique` AS `unique`, `p`.`date` AS `date`
            FROM  `articles` `p`
            LEFT JOIN  `authors` `a`  ON `p`.`unique` = `a`.`unique`
            WHERE  1
              AND  MATCH (`p`.`title`) AGAINST ('"heat"' IN BOOLEAN MODE)
            UNION 
         SELECT  `p`.`id` AS `id`, `p`.`title` AS `title`, `p`.`title_fa` AS `title_fa`,
                `p`.`unique` AS `unique`, `p`.`date` AS `date`
            FROM  `articles` `p`
            LEFT JOIN  `authors` `a`  ON `p`.`unique` = `a`.`unique`
            WHERE  1
              AND  MATCH (`p`.`title_fa`) AGAINST ('"گرما"' IN BOOLEAN MODE)
      ) AS `subQuery`
    GROUP BY  `unique`
    ORDER BY  `date` DESC
    LIMIT  0,10;

I don't know how should I use an index in the outer SELECT where it's grouping the two SELECTs using UNION.

Thanks

Update

This is the structure of the article table:

CREATE TABLE `articles` (
  `id` int(10) unsigned NOT NULL,
  `title` text COLLATE utf8_persian_ci NOT NULL,
  `title_fa` text COLLATE utf8_persian_ci NOT NULL,
  `description` text COLLATE utf8_persian_ci NOT NULL,
  `description_fa` text COLLATE utf8_persian_ci NOT NULL,
  `date` date NOT NULL,
  `unique` tinytext COLLATE utf8_persian_ci NOT NULL,
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;

ALTER TABLE `articles`
  ADD PRIMARY KEY (`id`),
  ADD KEY `unique` (`unique`(128)),
  ADD FULLTEXT KEY `TtlDesc` (`title`,`description`);
  ADD FULLTEXT KEY `Title` (`title`);
  ADD FULLTEXT KEY `faTtlDesc` (`title_fa`,`description_fa`);
  ADD FULLTEXT KEY `faTitle` (`title_fa`);
  MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT;

UPDATE 2:

Here is the output of EXPLAIN SELECT (I didn't know how to get it from phpMyAdmin any better! Sorry if it doesn't look good):

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   PRIMARY <derived2>  ALL NULL    NULL    NULL    NULL    4   Using temporary; Using filesort
2   DERIVED p   fulltext    title   title   0   NULL    1   Using where
3   UNION   p   fulltext    title_fa    title_fa    0   NULL    1   Using where
NULL    UNION RESULT    <union2,3>  ALL NULL    NULL    NULL    NULL    NULL    Using temporary
Community
  • 1
  • 1
SAVAFA
  • 818
  • 8
  • 23
  • Please provide `SHOW CREATE TABLE` for both tables. – Rick James Aug 02 '16 at 20:33
  • How well does `FULLTEXT` work with Arabic? (I thought it was mostly limited to English.) – Rick James Aug 02 '16 at 20:37
  • @RickJames, I've added the structure of the `article` table. You can simply forget about the `authors` table and remove its join. That doesn't effect – SAVAFA Aug 03 '16 at 04:56
  • The fulltext works flawless in my practice and haven't had any problem with it. By the way, it's Persian (Farsi) and not Arabic ;) – SAVAFA Aug 03 '16 at 05:01
  • Is there some way I can distinguish between Persian and Arabic without being able to read either? (I saw that the characters were utf8 for "ARABIC LETTER ALEF", etc.) – Rick James Aug 03 '16 at 05:15
  • Hmmm... there seem to be two different keys with the same name `faTtlDesc`?? – Rick James Aug 03 '16 at 05:17
  • Please provide the output from `EXPLAIN SELECT ...`. – Rick James Aug 03 '16 at 05:25
  • @RickJames, I've corrected the key names (my bad in writing it in here). Also I've added the `EXPLAIN SELECT` output to the problem body. – SAVAFA Aug 04 '16 at 20:43
  • If by "not using index", do you mean that `EXPLAIN` does not say `Using index`? That is, the index is not "covering"? Well, that is true. There is no "covering index" for this query. – Rick James Aug 04 '16 at 20:53
  • Yes I mean that, and also these queries are logged in the mysql log of queries not using index. Isn't that a problem? – SAVAFA Aug 05 '16 at 12:19
  • I added to my answer. Maybe this time I have gotten to the heart of your question. – Rick James Aug 05 '16 at 15:11

2 Answers2

3

) ASsubQuery

It is a subquery, a derived table, and it is manifested coming out of a temporary table. It has no chance of index use.

As I wrote in this answer:

The document Derived Tables in MySQL 5.7 describes it well for versions 5.6 and 5.7, where the latter will provide no penalty due to the change in materialized derived table output being incorporated into the outer query. In prior versions, substantial overhead was endured with temporary tables with the derived.

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • Thanks, does this mean it doesn't need it also? And is this also the source of problem with high `Handler_read_rnd_next`, `Handler_read_rnd ` etc.? – SAVAFA Aug 02 '16 at 19:22
  • For the Handler I would read anything written by someone having done research on that and willing to share it like [Percona](https://www.percona.com/blog/2010/06/15/what-does-handler_read_rnd-mean/). What version of mysql are you on? – Drew Aug 02 '16 at 19:31
  • MySQL version: 5.6.29-log. For those two it's been said that they are showing full table scan without proper indexing. I'm wondering if these joins without indexes are counted in here causing these variables to be high. – SAVAFA Aug 02 '16 at 19:32
  • 5.6 has no chance of using an index in derived's (on the outside using the derived) – Drew Aug 02 '16 at 19:37
  • So, there is no more chance of improving this query in your opinion? – SAVAFA Aug 04 '16 at 20:44
  • If I had all the data, sure there is a chance. I don't. You have many things working against it here. The Union. The 2nd part of what is being unioned (the FTS). Then the outside being a derived on 5.6 as a Temp table. – Drew Aug 04 '16 at 20:53
  • The data in larger that 10Gb, making it impossible to share. I need to union those two parts and both need to be FTS. – SAVAFA Aug 05 '16 at 12:21
  • So it is not possible for me to add anything more. Let us know if we have been helpful. – Drew Aug 05 '16 at 12:26
  • Maybe the `UNION` is not necessary? See my answer again. – Rick James Aug 05 '16 at 15:17
  • @RickJames as I mentioned the table is so big that I cannot simply add an index only for testing. Do you know any simpler way? – SAVAFA Aug 17 '16 at 04:24
  • @SAVAFA it is a derived table in a temp. We don't get paid but by upvotes for being helpful and we have asked for a dump. What are you expecting from us? – Drew Aug 17 '16 at 04:27
  • The expectations of some people I don't get it. – Drew Aug 17 '16 at 04:28
  • Thanks for the help. I didn't meant being rude! I just stated my situation. – SAVAFA Aug 17 '16 at 04:30
  • `Handler_*` values are often (not always) good clues at to how many times a table (or index) is scanned and whether it is written to a tmp table, then reread. And they are handy when experimenting with a table that is too small to get good timings from. – Rick James Aug 17 '16 at 22:03
1

When there is a MATCH clause, only a FULLTEXT index will be used.

Meanwhile, tips on syntax and pagination:

The usual pattern:

( SELECT ...
    GROUP BY ... ORDER BY ... -- apply to result of inner SELECT
)
UNION ALL
( SELECT ...
    GROUP BY ... ORDER BY ... -- apply to result of inner SELECT
)
GROUP BY ... ORDER BY ... -- apply to result of UNION

(If you need pagination, see my blog .)

Addenda

In the EXPLAIN... The 1st and 4th lines say ALL and NULL -- this indicates that no index was used in any way. In those cases, we are talking about 4 rows, and all 4 rows are needed. So, do not worry that no INDEX was used.

In the 2nd and 3rd lines, a FULLTEXT index was used.

The phrase Using index (which does not show in your EXPLAIN) does not mean "using some index", it means "using only the index". To elaborate... The data for a table is in one place, the index is in another. When all the necessary columns are in the index, the query does not need to reach over into the data. This is labeled as Using index, and it is termed a "covering index". This particular situation is not relevant for your query.

A similar phrase, Using index condition, means something else. It says that the WHERE clause can be handled by the storage engine, and does not need to involve the handler. Let's simply say that it is an optimization making things run a little faster.

Bottom line: You query is well written, and your indexes are fine for this query.

Maybe no UNION?

Try getting rid of the UNION and simply search for both strings at the same time:

FULLTEXT(title, title_fa)
MATCH (title, title_fa) AGAINST ('"heat" "گرما"' IN BOOLEAN MODE)

If that does not work, then explain what goes wrong.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • You seem to need the subquery because you want to remove `unique` and `date` columns. Some performance improvement could be had by moving GROUP/ORDER/LIMIT inside the subquery. – Rick James Aug 02 '16 at 20:36
  • I've tested different approaches for this query as mentioned in this post: (http://stackoverflow.com/questions/38495389/optimizing-a-select-union-query-with-order-and-limit-on-a-table-with-5m). There is duplicate results problem with the query you proposed. – SAVAFA Aug 03 '16 at 05:00
  • To get rid of the dup results, change `UNION ALL` to `UNION DISTINCT`. – Rick James Aug 04 '16 at 20:50
  • I could manage to get rid of the duplicates but the execution time doesn't change significantly – SAVAFA Aug 17 '16 at 04:27