0

I have two tables: all_users and vip_users

all_users table has a list of all users (you don't say?) in my system and it currently has around 57k records, while vip_users table has around 37k records.

Primary key in both tables is an autoincrement id field. all_users table is big in terms of attribute count (around 20, one of them is email), while vip_users table has only (along with id) email attribute.

I wanted to query out the "nonVip" users by doing this (with help of this question here on SO):

SELECT all_users.id, all_users.email
FROM all_users 
LEFT OUTER JOIN vip_users 
    ON (all_users.email=vip_users.email) 
WHERE vip_users.email IS NULL

And now, finally coming to the problem - I ran this query in phpmyadmin and even after 20 minutes I was forced to close it and restart httpd service as it was taking too long to complete, my server load jumped over 2 and the site (which also queries the database) became useless as it was just loading too slow. So, my question is - how do I make this query? Do I make some script and run it over night - not using phpmyadmin (is this maybe where the problem lies?), or do I need to use different SQL query?

Please help with your thoughts on this.

Community
  • 1
  • 1
Nikola
  • 14,888
  • 21
  • 101
  • 165

3 Answers3

3

Try indexing the fields email on both tables, that should speed up the query

CREATE INDEX useremail ON all_users(email)

CREATE INDEX vipemail ON vip_users(email)
Nico
  • 473
  • 3
  • 11
  • oh, you were so right!, this now gives me the answer in quoting: "Showing rows 0 - 29 ( 21,714 total, Query took 0.0043 sec)" – Nikola May 22 '12 at 18:54
0

I think NOT IN is faster and used less resource than LEFT OUTER JOIN.

Can you try -

SELECT *
FROM all_users
WHERE id NOT IN (SELECT id 
                 FROM vip_users
                 WHERE email IS NULL);
JHS
  • 7,761
  • 2
  • 29
  • 53
  • In MySQL, you specifically do NOT want to use a subselect. A join is much faster. EDIT- I just doublechecked, and this may no longer be true in MySQL >= version 6.0. – jwismar May 22 '12 at 16:28
  • @jwismar: No, you are confusing the behaviour of `IN (SELECT ...)` - which is often slow - with the behaviour of `NOT IN (SELECT ...)` which rarely has any issues. Similar performance with `LEFT JOIN / IS NULL` or `NOT EXISTS (subquery)` – ypercubeᵀᴹ May 22 '12 at 16:35
0

As written, you're not getting the results you're looking for. You're looking for vip_users rows where the email matches an email in users, and is also NULL.

Is there a reason you want vip_users to have a separate id from users? If you change the vip_users id field to a fk on the users id field, yo would then change your select to:

SELECT all_users.id, all_users.email
FROM all_users 
LEFT OUTER JOIN vip_users 
    ON (all_users.id=vip_users.id) 
WHERE vip_users.email IS NULL;

There's no reason this query should take any discernible about of time. 37k records is not a very big table....

jwismar
  • 12,164
  • 3
  • 32
  • 44