1
SELECT journey.code, journey.departure, journey.end, group_concat(pattern_road.latitude)
FROM journey
INNER JOIN journey_day ON journey_day.journey = journey.code
INNER JOIN pattern ON pattern.code = journey.pattern
LEFT JOIN pattern_road ON pattern_road.section = pattern.section
WHERE journey_day.day = 5 AND TIME(NOW()) BETWEEN journey.departure AND journey.end
GROUP BY journey.code

The above query takes roughly 100ms to execute. I am happy with this, but I now need to add another join, when I try to do this the query slows down to roughly 2 seconds.

This is the new slow query with the extra join:

SELECT journey.code, journey.departure, journey.end, group_concat(pattern_road.latitude) AS road, group_concat(link.stop) AS stop
FROM journey
INNER JOIN journey_day ON journey_day.journey = journey.code
INNER JOIN pattern ON pattern.code = journey.pattern
LEFT JOIN pattern_road ON pattern_road.section = pattern.section
INNER JOIN link ON link.section = pattern.section
WHERE journey_day.day = 5 AND TIME(NOW()) BETWEEN journey.departure AND journey.end
GROUP BY journey.code

Things to note:

The extra join also operates on the same column as the previous join (pattern_road) joins on, I can only think that this must be the cause of the problem. For example, if I replace pattern_road join with the link join, the query is back to 100ms, I just can't use both joins and have it run at 100ms.

Database schematic/indexes in SQL Fiddle

Any ideas why this is happening? Thanks in advance.

jskidd3
  • 4,609
  • 15
  • 63
  • 127
  • you need to index link on column section – Jasen Dec 27 '14 at 10:56
  • @Jasen I already have that index in place. I have linked an SQL Fiddle in the question so that you can see existing indexes and table structure. – jskidd3 Dec 27 '14 at 10:57
  • performance tests are impossible without data, maybe you could be running out of memory o – Jasen Dec 27 '14 at 11:06
  • To my way of thinking, there is almost no problem for which GROUP_CONCAT is the solution - especially so when it's the only aggregating function in the query. Why not omit it altogether? – Strawberry Dec 27 '14 at 12:00

2 Answers2

0

Can you try this way?

SELECT  SQL_NO_CACHE journey.code, journey.departure, journey.end,
  (select group_concat(pattern_road.latitude) from  pattern_road where pattern_road.section = pattern.section) AS road,
  (select group_concat(link.stop) from link where link.section = pattern.section) AS stop
FROM journey
INNER JOIN journey_day ON journey_day.journey = journey.code
INNER JOIN pattern ON pattern.code = journey.pattern
WHERE journey_day.day = 5 AND TIME(NOW()) BETWEEN journey.departure AND journey.end
GROUP BY journey.code            
Lajos Veres
  • 13,595
  • 7
  • 43
  • 56
  • Much better!! Thanks so much. That reduces the query back to down to 100ms. Just for reference, in the query you missed the `left join pattern_road` but referenced it in `select` in the answer, if I add that on it works perfectly :) – jskidd3 Dec 27 '14 at 11:03
  • Oh, 1 moment, maybe you don't need to add anything... just need to check the query again – jskidd3 Dec 27 '14 at 11:06
  • Oddly your query takes 6 seconds to run. However, if I add a left join like so: `LEFT JOIN pattern_road ON pattern_road.section = pattern.section` after the pattern join, it takes 100ms! No idea why this is happening – jskidd3 Dec 27 '14 at 11:08
  • Can you try to add SQL_NO_CACHE after the select and before the rows? – Lajos Veres Dec 27 '14 at 11:10
  • Could you edit your question and put SQL_NO_CACHE in it? Not quite sure where to put it – jskidd3 Dec 27 '14 at 11:13
  • No change with `SQL_NO_CACHE` added on – jskidd3 Dec 27 '14 at 11:16
  • The SQL_NO_CACHE should only turn off query caching which should make the comparison better. Without it if you run the same query 2 times, the second one can be much quicker if there wasn't any modification in the related tables between the 2 execution. – Lajos Veres Dec 27 '14 at 11:19
0

Remember that two indexes on two columns are not the same as one composite index on two columns. A query can only use one index per table*.

Add an index on pattern(code, section) or pattern(section, code) (a live test with the actual data is required to find out which one provides better result).


* except in very rare cases

Community
  • 1
  • 1
RandomSeed
  • 29,301
  • 6
  • 52
  • 87