1

i don't know why this query take 45 min to give me the result

table1= 831293 row

table2= 2000.000 row

SELECT ID, name FROM table1 WHERE name not IN (SELECT name FROM table2 )  LIMIT 831293

my.cnf file look like this :

max_allowed_packet  = 64M
innodb_buffer_pool_size = 8G
innodb_log_file_size    = 256M 
query_cache_type        = 1
query_cache_limit   = 1M
Ilyas Oirraq
  • 43
  • 1
  • 7

4 Answers4

1

This select is very expensive because in the worst case for every record in table1 you must go through 2.000.000 records in table2. I am not MySQL specialist, but on the Oracle platform I would consider indexing and table spaces for table2.

BTW, the LIMIT part is useless.

Michal Krasny
  • 5,434
  • 7
  • 36
  • 64
  • i'm using limit to get a specific number of rows from table 1 – Ilyas Oirraq Jun 11 '14 at 20:06
  • I think, that in this case the LIMIT statement limits the whole result, not the table 1 rows. If you want to limit table 1, then use something like this SELECT ID, name FROM (select ID, name from table1 LIMIT 831293) WHERE name not IN (SELECT name FROM table2 ) – Michal Krasny Jun 12 '14 at 06:49
1

Do you have an index on name?

If you have to loop each time through table2 per row, you effectively loop through 831293*200000 rows, which is a lot.

When you have an index on name, you can get a major performance increase, as it is easier to search for a name in this index. (probably even O(1) as there will probably be a hashtable)

you can do this as follows

ALTER TABLE `table2` ADD INDEX (`name`)
Pinna_be
  • 4,517
  • 2
  • 18
  • 34
  • Thomas Thank u dude your absolutly right, can u give me more explanation about how index on name make my query so much faster – Ilyas Oirraq Jun 11 '14 at 20:18
  • I,d like to, but here is an explanation that will be far better then mine: http://stackoverflow.com/questions/3567981/how-do-mysql-indexes-work – Pinna_be Jun 11 '14 at 20:52
1

agreed on: (a) add the indexes and (b) the limit clause is useless.

consider a 'left outer join' - it will bring all rows from the "left" table even where no join value in the "right" table exists - then you filter out the non-desired "right" values.

something like:

SELECT t1.ID, t1.name 
FROM table1 t1 left outer join table2 t2 
on t1.name = t2.name
where t2.name is null;

note: sql above assumes name is unique in t2, and not unique in t1.

matt stucky
  • 165
  • 2
  • 8
0

I think using Exist will be faster.Try This

select Id,name from
(
    select table1.ID as ID, table1.name as name  
    FROM table1 inner join table2 
    on table1.name=table2.name
)
    WHERE not EXISTS
    (
    select 1  
    FROM table1 inner join table2 
    on table1.name=table2.name);
VJ Hil
  • 904
  • 6
  • 15
  • i get this message when i try to execute not exists ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE not EXISTS – Ilyas Oirraq Jun 11 '14 at 20:09