0

Terrible title, sorry for not being able to concisely articulate this question.

  • I have a MySQL table (Table name: users) with 2m+ users (rows) in it. Each one has a score for that user. (Columns: userid, name, score)
  • I need to apply 'interests' to each user, so I have created an 'interests' table with columns (Columns: userid, interest). A new row is created each time an interest is assigned to a user.
  • I then need to select 50 users where their interest = 'surfing' and their score is between 10,000 and 50,000

There might be 500,000 users with a score in that range.

My query:

SELECT
  a.userid,
  a.interest,
  b.name,
  b.score
FROM interests AS a LEFT JOIN (SELECT
                                 userid,
                                 name,
                                 score
                               FROM users
                               WHERE score > 10000 AND score < 50000) AS b ON a.userid = b.userid
WHERE a.interest = 'surfing'
ORDER BY b.score DESC
LIMIT 50

So I think my above query will work, but I'm not sure I'm going about it in an efficient way. My understanding is that it's essentially selecting all interests rows where the interest = 'surfing' (this might be 50,000 rows) then performing a JOIN on the user table which itself might return 500,000 rows.

Dharman
  • 30,962
  • 25
  • 85
  • 135

2 Answers2

1

You perhaps do not need a derived query for join and do as

select
a.userid,
a.interest,
b.name,
b.score
from interests a 
LEFT JOIN users b on b.userid = a.userid
where 
b.score > 10000 AND b.score < 50000
and a.interest = 'surfing'
ORDER BY b.score DESC
LIMIT 50

And adding some indexes would make it faster, if userid is a primary key on user table then you do not need to re-index it on the same table.

alter table interests add index user_inter_idx(interest,userid);
alter table users add index user_score_idx(score);

NOTE : Make sure to take a backup of the tables before applying indexes on them.

You can also check the query health using

explain select ...

This will provide you an idea how the optimizer will work on the query.

Abhik Chakraborty
  • 44,654
  • 6
  • 52
  • 63
0
select 
  i.userid, 
  i.interest, 
  u.name, 
  u.score 
from 
  interests i    

  inner join users u on 
    i.userid = u.userid 
where 
  u.score between 1000 and 50000 and 
  i.interest = 'surfing' 
order by 
  u.score desc 
limit 50

Remember to add the following indexes:

  1. INTERESTS: userid, interest
  2. USERS: userid, score
javier_domenech
  • 5,995
  • 6
  • 37
  • 59
  • Make sure that you have an index on userid on the interests table. In general foreign keys should have indexes on them. - this will speed up the join. You can also include or add indexes on interest and score to play around with performance gains. – Ryan-Neal Mes Apr 23 '15 at 13:41
  • Thank you. I think your answer is a little easier to follow but figured I should select Abhik's because he was first. – freefishfoundry Apr 23 '15 at 14:28
  • No worries, just be aware that using LEFT join instead of INNER will be likely slower. http://stackoverflow.com/questions/2726657/inner-join-vs-left-join-performance-in-sql-server . You could test the performance by yourself and give us feedback :P – javier_domenech Apr 23 '15 at 14:34
  • Thanks for the follow up, I actually read both answers as using INNER JOIN. I'll post the performance results shortly but in the meantime your response best answers my question. – freefishfoundry Apr 23 '15 at 21:14