2

On one of my PHP/MySQL sites, every user can block every other user on the site. These blocks are stored in a Blocked table with each row representing who did the blocking and who is the target of the block. The columns are indexed for faster retrieval of a user's entire "block list".

For each user, we must exclude from any search results any user that appears in their block list.

In order to do that, is it better to:

1) Generate the "block list" whenever the user logs in by querying the Blocked table once at login and saving it to the $_SESSION (and re-querying any time they make a change to their "block list" and re-saving it to the $_SESSION), and then querying as such:

NOT IN ($commaSeparatedListFromSession)

or

2) Exclude the blocked users in "real-time" directly in the query by using a sub-query for each user's search query as such:

NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID) ?

ProgrammerGirl
  • 3,157
  • 7
  • 45
  • 82
  • Go with the sub-select. Allow the DB to do what it is good at: i.e. filtering at the earliest opportunity. Also, a good DB won't work slower just because you re-frame this as a JOIN. So, don;t do so unless you find the DB is not able to handle the query in its more intuitive form. – Darius X. Apr 26 '13 at 15:14

3 Answers3

2

If the website is PHP and the blocklist is less than say 100 total per user I would store it in a table, load it to $_SESSION when changed/loggging in. You could just as easily load it from SQL on each page load into a local variable however.

What I would store in $_SESSION is a flag 'has_blocklist_contents' that would decide whether or not you should load or check the blocklist on page load.

Instead of then using a NOT IN with all of your queries the list I think it might be smarter to filter them out using PHP.

I have two reasons for wanting to implement this way:

  1. Your database can re-use the SQL for all users on the system resulting in a performance boost for retrieving comments and such.
  2. Your block list will most of the time be empty, so you're not adding any processing time for the majority of users.
Brent
  • 304
  • 1
  • 7
  • I think this should be the accepted answer since blocking will only apply to a minority of users. Create an array of blocked IDs on log in then compare with MySQL results each time in script rather than draining MySQL resources for everyone. There will always be a cost for this functionality. This way seems the least costly. – Colin R. Turner Oct 01 '19 at 12:37
1

I think there is 3rd solution to it. In my opinion this would be the better way to go.

If you can write this

NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID)

Then you can surely write this.

....
SomeTable st
LEFT JOIN
Blocked b
ON( st.userid = b.userid AND Blocked.from = $currentUserID)
WHERE b.primaryKey IS NULL;

I hope you understand what I mean by the above query. This way you get the best of both worlds i.e. You don't have to run 2 queries, and you don't have to save data in $_SESSION

Manu
  • 901
  • 1
  • 8
  • 28
  • Interesting solution! Is this actually faster than doing a sub-query with a **covered** index? – ProgrammerGirl Apr 25 '13 at 17:56
  • Yes, this would be faster than doing sub-query. – Manu Apr 25 '13 at 18:01
  • How come it's faster than doing the sub-query? Also, shouldn't it be `IS NULL;` at the end instead of `IS NOT NULL;` if you want to show everyone except the blocked users? – ProgrammerGirl Apr 25 '13 at 19:11
  • Ya, my bad. It should be `IS NULL` . I am going to edit my comment. :D – Manu Apr 26 '13 at 06:35
  • Sub-queries are generally slower than Joins, because subqueries involve executing more than 1 query. – Manu Apr 26 '13 at 06:37
  • [Read this](http://stackoverflow.com/questions/2577174/join-vs-subquery) to understand why sub-queries are slower than joins. – Manu Apr 26 '13 at 18:51
0

Don't use the $_SESSION as a substitute for a proper caching system. The more junk you pile into $_SESSION, the more you'll have to load for each and every request.

Using a sub-select for exclusions can be brutally slow if you're not careful to keep your database tuned. Make sure your indexes are covering all your WHERE conditions.

tadman
  • 208,517
  • 23
  • 234
  • 262