1

I have a master table which has details.

I wanted to find all the combinations for a product in that session with every other product in that particular sessions for all sessions.

create table combinations as 
select
    a.main_id,
    a.sub_id as sub_id_x,
    b.sub_id as sub_id_y,
    count(*) as count1,
    a.dates as rundate 
from
    master_table a 
    left join
        master_table b 
        on a.session_id = b.session_id 
        and a.visit_number = b.visit_number 
        and a.main_id = b.main_id 
        and a.sub_id != b.sub_id 
where
    a.sub_id is not null 
    and b.sub_id is not null 
group by
    a.main_id,
    a.sub_id,
    b.sub_id,
    rundate;

I did a explain on query

+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 298148 |    90.00 | Using where; Using temporary; Using filesort       |
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 298148 |     0.08 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+

The main issue is, my master table consists of 80 million rows. This query is taking more than 24 hours to execute.

All the columns are indexed and I am doing a self join.

Would creating a like table first 'master_table_2' and then doing a join would make my query faster?

Is there any way to optimize the query time?

James Z
  • 12,209
  • 10
  • 24
  • 44
Shubham R
  • 7,382
  • 18
  • 53
  • 119

1 Answers1

0

As your table consists of a lot of rows, the join query will take a lot of time if it is not optimized properly and the WHERE clause is not used properly. But an optimized query could save your time and effort. The following link has a good explanation about the optimization of the join queries and its facts - Optimization of Join Queries

@Marcus Adams has already provided a similar answer here

Another option is you can select individually and process in the code end for the optimization. But it is only applicable in some specific conditions only. You will have to try to compare both processes (join query and code end execution) and check the performance. I have got better performance once using this method.

Suppose a join query is like as the following -

SELECT A.a1, B.b1, A.a2
FROM A
INNER JOIN B
ON A.a3=B.b3 
WHERE B.b3=C;

What I am trying to say is query individually from A and B satisfying the necessary conditions and then try to get your desired result from the code end.

N.B. : It is an unorthodox way and it could not be taken as granted to be applicable in all criteria.

Hope it helps.

Community
  • 1
  • 1
masud_moni
  • 1,121
  • 16
  • 33
  • i didn't get your 'Another option' sir. If you please have lil time, can you have queries for me or a simple explaination how to select individually – Shubham R Dec 22 '16 at 09:25
  • I have updated the answer. Please check this if it can help. And no 'sir' please. :) I will be happy if I can be of any help. @SRingne – masud_moni Dec 22 '16 at 10:54